A Dataset of Network Traffic Collected During Large-Scale Human Genome Sequence Analysis

- Citation Author(s):
- Manas Das (University of Missouri), Khawar Shehzad (University of Missouri), Praveen Rao (University of Missouri)
- Submitted by:
- Praveen Rao
- DOI:
- 10.21227/y0t5-1w13
Abstract
This dataset contains .pcap files collected while running variant calling on a large number of human genomes on a cluster. The GATK4 variant calling pipeline was executed using AVAH on two testbeds, CloudLab and FABRIC. A 16-node cluster was used on CloudLab, and an 8-node cluster was used on FABRIC. The files were collected by running tcpdump on the network interfaces of the nodes. One file was produced every 30 minutes, and a snapshot length of 94 bytes was specified for tcpdump. On CloudLab, bare-metal servers were used for the cluster; on FABRIC, virtual machines were used. Each .pcap file is named with the worker/host name and the start time at which traffic collection began for that file.
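Because a 94-byte snapshot length was used, each stored packet holds at most 94 bytes of data, while the per-packet record header still carries the original on-wire length. The Python sketch below illustrates how the two lengths can be compared using only the standard library. It assumes the files are in the classic libpcap format (not pcapng), which has not been verified against this dataset; the function name `summarize_pcap` is hypothetical.

```python
import struct

# First four bytes of a classic pcap file, mapped to the struct byte-order prefix.
_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}


def summarize_pcap(stream):
    """Return snaplen, packet count, captured bytes, and on-wire bytes.

    `stream` is a binary file object positioned at the start of a
    classic pcap file (assumption: not pcapng).
    """
    header = stream.read(24)
    if len(header) < 24 or header[:4] not in _MAGICS:
        raise ValueError("not a classic pcap file")
    order = _MAGICS[header[:4]]
    # Fields after the magic: version major/minor, thiszone, sigfigs, snaplen, linktype.
    _, _, _, _, snaplen, _ = struct.unpack(order + "HHiIII", header[4:])

    packets = captured = on_wire = 0
    while True:
        record = stream.read(16)
        if len(record) < 16:
            break  # end of file (or a truncated trailing record)
        # Record header: ts_sec, ts_frac, stored length, original length.
        _, _, incl_len, orig_len = struct.unpack(order + "IIII", record)
        stream.read(incl_len)  # skip the (possibly truncated) packet bytes
        packets += 1
        captured += incl_len
        on_wire += orig_len
    return {"snaplen": snaplen, "packets": packets,
            "captured_bytes": captured, "on_wire_bytes": on_wire}
```

To use it, open an extracted file in binary mode, for example `with open(path, "rb") as f: summarize_pcap(f)`, where `path` points to one of the .pcap files. The `on_wire_bytes` total reflects traffic volume even though the payloads themselves were truncated at capture time.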
Instructions:
1. Download the .pcap.tar.gz files. The name of the testbed is provided as a prefix for each tarball. Each tarball corresponds to the traffic sent/received by one worker node in the cluster.
2. Unzip/untar the files using tar. For example, use: tar xvfz <testbed>-vm1.tar.gz
3. Use Wireshark (https://www.wireshark.org/) or tshark (https://www.wireshark.org/docs/man-pages/tshark.html) to analyze the traffic data.
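For scripted workflows, the extraction step above can also be done from Python. The sketch below is one possible approach using the standard `tarfile` module; the helper name `extract_pcaps` is hypothetical, and it assumes the capture files inside each tarball end in `.pcap`.

```python
import tarfile
from pathlib import Path


def extract_pcaps(tarball, dest):
    """Extract a .tar.gz archive into `dest` and return the .pcap paths found."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    # Assumption: capture files carry the .pcap extension.
    return sorted(dest.rglob("*.pcap"))
```

The returned paths can then be opened in Wireshark or passed to tshark one at a time, for example with `tshark -r <file>`.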
Dataset Files
- Worker 1/CloudLab testbed (Size: 5.22 GB)
- Worker 2/CloudLab testbed (Size: 6.02 GB)
- Worker 3/CloudLab testbed (Size: 5.76 GB)
- Worker 4/CloudLab testbed (Size: 5.74 GB)
- Worker 5/CloudLab testbed (Size: 7.07 GB)
- Worker 6/CloudLab testbed (Size: 5.34 GB)
- Worker 7/CloudLab testbed (Size: 6.02 GB)
- Worker 8/CloudLab testbed (Size: 6.24 GB)
- Worker 9/CloudLab testbed (Size: 6.17 GB)
- Worker 10/CloudLab testbed (Size: 9.07 GB)
- Worker 11/CloudLab testbed (Size: 5.81 GB)
- Worker 12/CloudLab testbed (Size: 9.15 GB)
- Worker 13/CloudLab testbed (Size: 5.27 GB)
- Worker 14/CloudLab testbed (Size: 5.69 GB)
- Worker 15/CloudLab testbed (Size: 6.79 GB)
- Worker 1/FABRIC testbed (Size: 3.63 GB)
- Worker 2/FABRIC testbed (Size: 3.13 GB)
- Worker 3/FABRIC testbed (Size: 5.46 GB)
- Worker 4/FABRIC testbed (Size: 3.28 GB)
- Worker 5/FABRIC testbed (Size: 6.78 GB)
- Worker 6/FABRIC testbed (Size: 3.36 GB)
- Worker 7/FABRIC testbed (Size: 3.42 GB)