The emerging 5G services offer numerous new opportunities for networked applications. In this study, we seek to answer two key questions: i) is the throughput of mmWave 5G predictable, and ii) can we build "good" machine learning models for 5G throughput prediction? To this end, we conduct a measurement study of commercial mmWave 5G services in a major U.S. city, focusing on the throughput as perceived by applications running on user equipment (UE).




Lumos5G 1.0 is a dataset that represents the `Loop` area of the IMC'20 paper "Lumos5G: Mapping and Predicting Commercial mmWave 5G Throughput". The Loop area is a 1300-meter loop near U.S. Bank Stadium in downtown Minneapolis that covers roads, railroad crossings, restaurants, coffee shops, and recreational outdoor parks.

This dataset is being made available to the research community.


The columns in the dataset CSV, from left to right, are described below:

- `run_num`: Indicates the run number. For each trajectory and mobility mode, we conduct several runs of experiments.
- `seq_num`: This is the sequence number. For each run, the sequence number acts like an index or a per-second timeline.
- `abstractSignalStr`: Indicates the abstract signal strength as reported by the Android API. Regardless of whether the UE was connected to 5G service, this column always reported a value associated with the LTE/4G radio. For signal strength values related to 5G-NR, refer to other columns such as `nr_ssRsrp`, `nr_ssRsrq`, and `nr_ssSinr`.
- `latitude`: The latitude in degrees as reported by Android's API.
- `longitude`: The longitude in degrees as reported by Android's API.
- `movingSpeed`: The ground mobility/moving speed of the UE as reported by Android's API. The unit is meters per second.
- `compassDirection`: The bearing in degrees as reported by Android's API. Bearing is the horizontal direction of travel of this device, and is not related to the device orientation. It is guaranteed to be in the range `(0.0, 360.0]` if the device has a bearing.
- `nrStatus`: Indicates whether the UE was connected to the 5G network. When `nrStatus=CONNECTED`, the UE was connected to 5G. All other values of `nrStatus`, such as `NOT_RESTRICTED` and `NONE`, indicate that the UE was not connected to 5G. `nrStatus` was obtained by parsing the raw string representation of the `ServiceState` object.
- `lte_rssi`: The Received Signal Strength Indication (RSSI) in dBm of the primary serving LTE cell. The value range is [-113, -51] inclusive, or CellInfo#UNAVAILABLE if unavailable. Reference: TS 27.007 8.5 Signal quality +CSQ.
- `lte_rsrp`: Get reference signal received power (RSRP) in dBm of the primary serving LTE cell.
- `lte_rsrq`: Get reference signal received quality (RSRQ) of the primary serving LTE cell.
- `lte_rssnr`: Get reference signal signal-to-noise ratio (RSSNR) of the primary serving LTE cell.
- `nr_ssRsrp`: Obtained by parsing the raw string representation of the `SignalStrength` object. `nr_ssRsrp` was a field in this object's `CellSignalStrengthNr` section. In general, this value was only available when the UE was connected to 5G (i.e., when `nrStatus=CONNECTED`). Reference: 3GPP TS 38.215. Range: -140 dBm to -44 dBm.
- `nr_ssRsrq`: Obtained by parsing the raw string representation of the `SignalStrength` object. `nr_ssRsrq` was a field in this object's `CellSignalStrengthNr` section. In general, this value was only available when the UE was connected to 5G (i.e., when `nrStatus=CONNECTED`). Reference: 3GPP TS 38.215. Range: -20 dB to -3 dB.
- `nr_ssSinr`: Obtained by parsing the raw string representation of the `SignalStrength` object. `nr_ssSinr` was a field in this object's `CellSignalStrengthNr` section. In general, this value was only available when the UE was connected to 5G (i.e., when `nrStatus=CONNECTED`). Reference: 3GPP TS 38.215 Sec 5.1.*, 3GPP TS 38.133. Range: -23 dB to 40 dB.
- `Throughput`: Indicates the throughput perceived by the UE. iPerf 3.7 was used to measure the per-second TCP downlink at the UE.
- `mobility_mode`: Indicates the ground truth about the mobility mode when the experiment was conducted. This value can be either walking or driving.
- `trajectory_direction`: Indicates the ground truth about the trajectory direction of the experiment conducted at the Loop area. `CW` indicates clockwise direction, while `ACW` indicates anti-clockwise. Note, the driving experiments were only conducted in `CW` direction as certain parts of the loop were one way only. Walking-based experiments were conducted in both directions.
- `tower_id`: Indicates the (anonymized) tower identifier.

Note: We found that availability (and at times even the values) of `lte_rssi`, `nr_ssRsrp`, `nr_ssRsrq` and `nr_ssSinr` were not reliable. Since these values were sampled every second, at certain times (e.g., boundary cases), we might still find NR-related values when `nrStatus` is not equal to `CONNECTED`. However, in this dataset, we still include all the raw values as reported by the APIs.
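
As a minimal sketch of how the columns above might be used, the following Python snippet groups per-second throughput samples by whether `nrStatus` equals `CONNECTED`. The inline sample rows are made-up values for illustration only (they are not taken from the dataset), and the helper name `mean_throughput_by_5g_state` is hypothetical.

```python
import csv
import io
from collections import defaultdict


def mean_throughput_by_5g_state(rows):
    """Average the `Throughput` column separately for 5G and non-5G samples.

    A row counts as 5G only when nrStatus == "CONNECTED"; every other value
    (e.g. NOT_RESTRICTED, NONE) is treated as not connected to 5G.
    """
    totals = defaultdict(lambda: [0.0, 0])  # state -> [sum, count]
    for row in rows:
        state = "5G" if row.get("nrStatus") == "CONNECTED" else "non-5G"
        try:
            value = float(row["Throughput"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric throughput
        totals[state][0] += value
        totals[state][1] += 1
    return {state: total / count for state, (total, count) in totals.items()}


# Made-up rows for illustration; a real run would open the dataset CSV instead.
SAMPLE_CSV = """run_num,seq_num,nrStatus,Throughput
1,1,CONNECTED,800
1,2,CONNECTED,1000
1,3,NOT_RESTRICTED,100
1,4,NONE,50
"""

result = mean_throughput_by_5g_state(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(result)
```

Because the NR-related columns can be unreliable at boundary cases, as noted above, filtering on `nrStatus` alone is a simplification; a stricter analysis might also check that the `nr_*` fields are populated.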


author = {Narayanan, Arvind and Ramadan, Eman and Mehta, Rishabh and Hu, Xinyue and Liu, Qingxu and Fezeu, Rostand A. K. and Dayalan, Udhaya Kumar and Verma, Saurabh and Ji, Peiqi and Li, Tao and Qian, Feng and Zhang, Zhi-Li},
title = {Lumos5G: Mapping and Predicting Commercial MmWave 5G Throughput},
year = {2020},
isbn = {9781450381383},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {},
doi = {10.1145/3419394.3423629},
booktitle = {Proceedings of the ACM Internet Measurement Conference},
pages = {176–193},
numpages = {18},
keywords = {bandwidth estimation, mmWave, machine learning, Lumos5G, throughput prediction, deep learning, prediction, 5G},
location = {Virtual Event, USA},
series = {IMC '20}


Please feel free to contact the FiveGophers/Lumos5G team for questions or information about the data.


Lumos5G 1.0 dataset is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.


We conduct, to our knowledge, the first measurement study of commercial 5G performance on smartphones by closely examining the 5G networks of three carriers (two mmWave carriers, one mid-band 5G carrier) in three U.S. cities. We conduct extensive field tests on 5G performance in diverse urban environments. We systematically analyze the handoff mechanisms in 5G and their impact on network performance, and explore the feasibility of using location and possibly other environmental information to predict the network performance.




5Gophers 1.0 is a dataset collected when the world's very first commercial 5G services were made available to consumers. It should serve as a baseline for evaluating the evolution of 5G performance over time. Results using this dataset are presented in our measurement paper "A First Look at Commercial 5G Performance on Smartphones".

This dataset is being made available to the research community.


All the files are in CSV format with headers that should be self-explanatory.

├── All-Carriers
│   ├── 01-Throughput
│   ├── 02-Round-Trip-Times
│   └── 03-User-Mobility
└── mmWave-only
    ├── 03-UE-Panel (LoS Tests)
    ├── 04-Ping-Traces (Latency Tests)
    ├── 05-UE-Panel (NLoS Tests)
    ├── 06-UE-Panel (Orientation Tests)
    ├── 07-UE-Panel (Distance Tests)
    ├── 08-Web-Page-Load-Tests
    ├── 09-HTTPS-CDN-vs-NonCDN (Download Test)
    └── 10-HTTP-vs-HTTPS (Download Test)


author = {Narayanan, Arvind and Ramadan, Eman and Carpenter, Jason and Liu, Qingxu and Liu, Yu and Qian, Feng and Zhang, Zhi-Li},
title = {A First Look at Commercial 5G Performance on Smartphones},
year = {2020},
isbn = {9781450370233},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {},
doi = {10.1145/3366423.3380169},
booktitle = {Proceedings of The Web Conference 2020},
pages = {894–905},
numpages = {12},
location = {Taipei, Taiwan},
series = {WWW ’20}


Please feel free to contact the FiveGophers team for information about the data.


5Gophers 1.0 dataset is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.


This dataset contains thousands of Channel State Information (CSI) samples collected using the 64-antenna KU Leuven Massive MIMO testbed. The measurements focused on four different antenna array topologies: URA LoS, URA NLoS, ULA LoS, and DIS LoS. The users' channel is collected using CNC-tables, resulting in a dataset where all samples are provided with a very accurate spatial label. The user position is swept across a 9 square meter area, halting every 5 millimeters, resulting in a dataset size of 252,004 samples for each measured topology.


The dataset contains Channel State Information (CSI) samples, recorded with the KU Leuven Massive MIMO testbed. This was done for many user positions lying on a grid. The Base Station (BS) is equipped with 64 antennas, each receiving a predefined pilot signal from each position. Using these pilot signals, the CSI is estimated for 100 subcarriers, evenly spaced in frequency over a 20 MHz bandwidth. As a result, the complex-valued matrix H represents the measured CSI for one location. This matrix spans N rows and K columns, with N being the number of BS antennas and K the number of subcarriers. For further details about the system, the National Instruments Massive MIMO Application Framework documentation can be consulted.  

To collect the CSI from many different user locations, four single-antenna User Equipments were positioned in an office. Their antennas were moved along a predefined route using CNC XY-tables. This route zigzagged along a grid in steps of 5 mm. The total grid spans 1.25 m by 1.25 m. Because the XY-tables position the antennas precisely, the error on the positional label is less than 1 mm. This resulted in a dataset containing 252,004 CSI samples, each spatially labelled with an error of less than 1 mm.  
Furthermore, the testbed’s BS is designed to be very flexible in the deployment of the antenna array. This allowed for the creation of three different datasets, each with a unique antenna deployment. First, a Uniform Rectangular Array (URA) of 8 by 8 antennas was deployed, both in LoS and in NLoS (using a metal blocker). Second, a Uniform Linear Array (ULA) of 64 antennas on one line was deployed. Finally, the antennas were distributed over the room in groups of eight, making up the distributed (DIS) scenario.  
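
The sample count quoted above can be reproduced from the grid geometry. The sketch below shows one reading that is consistent with the stated numbers; it assumes that each of the four users covers its own full 1.25 m by 1.25 m grid and that both end points of each axis are sampled. The function name is illustrative.

```python
def grid_points_per_axis(side_mm: int, step_mm: int) -> int:
    """Number of sample positions along one axis, counting both end points."""
    steps, remainder = divmod(side_mm, step_mm)
    if remainder:
        raise ValueError("grid side must be a whole multiple of the step size")
    return steps + 1


points_per_axis = grid_points_per_axis(side_mm=1250, step_mm=5)  # 250 steps + 1
samples_per_user = points_per_axis ** 2                          # square grid
total_samples = 4 * samples_per_user                             # four users

print(points_per_axis, samples_per_user, total_samples)
```

Under these assumptions each user contributes 251 × 251 = 63,001 positions, and four users give 252,004 samples.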

The different deployments can be seen in the attached picture. In all cases, the antennas are placed 1 m from the XY-tables. The yellow rectangles in the figure depict the 1.25 m by 1.25 m areas within which the XY-tables can move the users. The spacing between the XY-tables is dictated by the space needed for the motors powering the movements and the cables connecting them to the controllers. These tables were synchronised over Ethernet with the BS to ensure that each sampled H has a correct spatial label, enabling a highly accurate dataset.

During the measurements, the BS was configured to use a centre frequency of 2.61 GHz, giving a wavelength λ = c/f of approximately 114.86 mm. The system used a bandwidth of 20 MHz. The origin of the space was defined as the middle of the URA. From this point in space, the x- and y-positions of the users and antennas were measured. These locations are provided in 3D in the dataset.


This dataset contains the measurement data of a channel sounding campaign in the hull of a bulk carrier vessel at the mmWave frequency of 60.48 GHz. The directive radio channel for Line-of-Sight (LOS) communication is measured using the Terragraph channel sounder. A path loss (PL) model that depends on the antenna beam width is created. At mmWave frequencies, LOS PL in the vessel is close to PL in a free space environment, but angular spread values are lower compared to other indoor scenarios.

The processing results of these measurements are presented in the following two papers.


We performed Line-of-Sight measurements in the hull of a bulk carrier vessel using the Terragraph channel sounder. The measurements are performed in the engine room and steering control room of the Premium Do Brasil, a 200 m long juice carrier. We selected 6 locations and measured at different distances in order to have a measurement every 0.25 m and 2 measurements (at 2 different locations) for every 0.5 m. The measurement data can be found in the ZIP archive, which also contains some pictures in the **Setup** directory.

The **MeasurementData** directory of the ZIP archive contains a single folder for every measurement, with the naming structure `X_locY_Z`, in which:

- X: Date of measurement
- Y: Location of measurement (see **Setup** folder)
- Z: Distance between the two nodes
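
The naming structure above can be split into its three parts with a small helper. This is a sketch under assumptions: the exact formats of the date and distance fields are not specified here, so both are kept as raw strings, and the example folder name is hypothetical rather than taken from the archive.

```python
import re
from typing import NamedTuple


class MeasurementFolder(NamedTuple):
    date: str      # X: date of measurement (format not specified, kept as text)
    location: str  # Y: location identifier (see the Setup folder)
    distance: str  # Z: distance between the two nodes (kept as text)


# X and Y are assumed not to contain underscores; Z takes the remainder.
_FOLDER_PATTERN = re.compile(r"^(?P<date>[^_]+)_loc(?P<location>[^_]+)_(?P<distance>.+)$")


def parse_measurement_folder(name: str) -> MeasurementFolder:
    """Split a folder name of the form X_locY_Z into its three fields."""
    match = _FOLDER_PATTERN.match(name)
    if match is None:
        raise ValueError(f"folder name does not follow X_locY_Z: {name!r}")
    return MeasurementFolder(
        match.group("date"), match.group("location"), match.group("distance")
    )


# Hypothetical folder name, for illustration only.
example = parse_measurement_folder("20210315_loc3_2.5m")
print(example)
```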

Every folder contains the log file with the configuration settings, the results and normalized results files, and a plot of the beam sweep path loss information. No GPS information is recorded. All the path loss data is combined and fitted to a one-slope model.
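
A one-slope model is commonly written as PL(d) = PL0 + 10·n·log10(d/d0), with reference path loss PL0 at distance d0 and path loss exponent n. The sketch below fits that common form by ordinary least squares; the exact parameterization used in the papers is not given here, so this form is an assumption. The input data is synthetic free-space path loss at 60.48 GHz, generated only to check that the fit recovers n = 2, and is not measurement data from the archive.

```python
import math


def fit_one_slope(distances_m, path_loss_db, d0_m=1.0):
    """Least-squares fit of PL(d) = PL0 + 10*n*log10(d/d0); returns (PL0, n)."""
    xs = [10.0 * math.log10(d / d0_m) for d in distances_m]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(path_loss_db) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, path_loss_db))
    n = sxy / sxx                 # slope: path loss exponent
    pl0 = mean_y - n * mean_x     # intercept: path loss at d0
    return pl0, n


def free_space_path_loss_db(d_m, f_hz):
    """Free-space path loss in dB (Friis), used here to make synthetic data."""
    c = 299_792_458.0  # speed of light in m/s
    return 20.0 * math.log10(4.0 * math.pi * d_m * f_hz / c)


# Synthetic distances from 1.0 m to 10.0 m in 0.25 m steps (not real data).
distances = [0.25 * k for k in range(4, 41)]
losses = [free_space_path_loss_db(d, 60.48e9) for d in distances]

pl0, n = fit_one_slope(distances, losses)
print(f"PL0 = {pl0:.2f} dB at 1 m, n = {n:.3f}")
```

With real measurements, an exponent close to 2 would be in line with the observation that LOS path loss in the vessel is close to free space.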


This dataset was created for the following paper: Seonghoon Jeong, Boosun Jeon, Boheung Chung, and Huy Kang Kim, "Convolutional neural network-based intrusion detection system for AVTP streams in automotive Ethernet-based networks," Vehicular Communications, DOI: 10.1016/j.vehcom.2021.100338.



The following devices are connected to the automotive Ethernet testbed:

  • a RAD-Galaxy: BroadR-Reach switch
  • two neoECU AVB/TSN (AVB/TSN Endpoint Simulation): configured as an AVB talker and an AVB listener, respectively
  • a RAD-Moon: a media converter (between BroadR-Reach and Ethernet)
  • a USB camera connected to the AVB talker

The dataset contains four benign (attack-free) packet captures. 

  • driving_01_original.pcap (about 10 min)
  • driving_02_original.pcap (about 16 min)
  • indoors_01_original.pcap (about 24 min)
  • indoors_02_original.pcap (about 21 min)


We suppose that an attacker injects arbitrary stream AVTP data units (AVTPDUs) into the IVN. The goal of the attacker is to output a single video frame, at a terminal application connected to the AVB listener, by injecting previously generated AVTPDUs during a certain period. To demonstrate the attack, we extract 36 continuous stream AVTPDUs (`single-MPEG-frame.pcap`) from one of our AVB datasets; the extracted AVTPDUs constitute one video frame. Then, the attacker performs a replay attack by sending the 36 stream AVTPDUs repeatedly. Check the `*_injected.pcap` files for the result of the replay attack.
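
To sanity-check a capture without Wireshark, for example to confirm that a file holds 36 records, the records can be counted with the Python standard library. This is a minimal sketch that assumes the files use the classic libpcap format (not pcapng); the function name is illustrative, and the demo below runs on a synthetic in-memory capture rather than on a dataset file.

```python
import io
import struct


def iter_pcap_records(stream):
    """Yield (ts_sec, ts_frac, payload) for each record of a classic pcap stream."""
    header = stream.read(24)  # global header is 24 bytes
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = struct.unpack("<I", header[:4])[0]
    if magic in (0xA1B2C3D4, 0xA1B23C4D):      # little-endian (usec / nsec)
        endian = "<"
    elif magic in (0xD4C3B2A1, 0x4D3CB2A1):    # big-endian (usec / nsec)
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    while True:
        record_header = stream.read(16)  # ts_sec, ts_frac, incl_len, orig_len
        if len(record_header) < 16:
            return
        ts_sec, ts_frac, incl_len, _orig_len = struct.unpack(endian + "IIII", record_header)
        payload = stream.read(incl_len)
        if len(payload) < incl_len:
            return  # truncated final record
        yield ts_sec, ts_frac, payload


def build_synthetic_pcap(payloads):
    """Build a little-endian classic pcap in memory (demo data only)."""
    out = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    for index, payload in enumerate(payloads):
        out += struct.pack("<IIII", index, 0, len(payload), len(payload)) + payload
    return out


# Synthetic capture with 36 dummy records, standing in for a real file.
demo = build_synthetic_pcap([bytes([i]) * 60 for i in range(36)])
record_count = sum(1 for _ in iter_pcap_records(io.BytesIO(demo)))
print(record_count)
```

For a real file, the same generator can be applied to `open(path, "rb")`.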


To open the packet captures, we recommend researchers use Wireshark and the following plug-ins:



This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2018-0-00312, Developing technologies to predict, detect, respond, and automatically diagnose security threats to automotive Ethernet-based vehicle).



The MIMOSigRef-SD dataset was created with the goal of supporting the research community in the design and development of novel multiple-input multiple-output (MIMO) transceiver architectures. It was recorded using software radios as transmitters and receivers, and a wireless channel emulator to facilitate a realistic representation of a variety of different channel environments and conditions.


The MIMOSigRef-SD dataset is provided in the form of 8 individual TAR files. Each file represents one of the 8 modulation schemes utilized in our dataset. Within each file, the data is organized in a similar format: Naming of the contained folders represents the modulation scheme and order, channel environment, MIMO configuration, and type of MIMO. An example of such naming is: 16QAM – Vehicular B – (TX2 – RX1) – Spatial Diversity. This provides easy access to the information of interest.
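
The folder naming described above can be parsed into its fields with a regular expression. This is a sketch based only on the single example name given; it assumes the fields are separated by a dash (en dash or hyphen) with optional spaces, and the actual folder names in the TAR files may differ. The function and field names are illustrative.

```python
import re

# Separator: en dash or hyphen, with optional surrounding whitespace (assumed).
_SEP = r"\s*[–-]\s*"

_FOLDER_PATTERN = re.compile(
    r"^(?P<modulation>.+?)" + _SEP
    + r"(?P<channel>.+?)" + _SEP
    + r"\(TX(?P<tx>\d+)" + _SEP + r"RX(?P<rx>\d+)\)" + _SEP
    + r"(?P<mimo_type>.+)$"
)


def parse_folder_name(name: str) -> dict:
    """Split a folder name into modulation, channel, TX/RX counts and MIMO type."""
    match = _FOLDER_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognised folder name: {name!r}")
    fields = match.groupdict()
    fields["tx"] = int(fields["tx"])
    fields["rx"] = int(fields["rx"])
    return fields


parsed = parse_folder_name("16QAM – Vehicular B – (TX2 – RX1) – Spatial Diversity")
print(parsed)
```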


Accurate and efficient anomaly detection is a key enabler for the cognitive management of optical networks, but traditional anomaly detection algorithms are computationally complex and do not scale well with the amount of monitoring data. Therefore, this dataset enables research on new optical spectrum anomaly detection schemes that exploit computer vision and deep unsupervised learning to perform optical network monitoring relying only on constellation diagrams of received signals.


The dataset contains a set of folders, each one representing one normal/anomalous case.

Within each folder, a number of .mat files contain the raw data collected from VPITransmissionMaker. The images folder contains the rendered constellation diagrams.

To render your own constellation diagrams, check the "generate_plots.m" file in the root folder.

More information on how to use the dataset is available in the GitHub repository.


Communication of selected devices captured on a LAN. The communication is stored in pcapng format. The files capture each device's communication, such as connecting to a network, configuration, and other activities specific to each device.

Selected devices are:



- NETATMO smart radiator valves

- BML smart IP camera


Bluetooth communication is widely adopted in IoMT devices due to its various benefits. Nevertheless, because of its simplicity as a personal wireless communication protocol, Bluetooth lacks adequate security mechanisms, which may result in devastating outcomes for patients treated using wireless medical devices.


The prototype transmitter was placed on the floating platform to generate the inductive magnetic field, and the distance between the rotating magnet and the sea surface was 35 cm to 45 cm. The waterproof ferrite-rod coil was hung in the seawater as a receiving antenna at a depth of 10 m.