CRAWDAD umd/sigcomm2008

Citation Author(s):
Aaron Schulman, University of Maryland
Dave Levin, University of Maryland
Neil Spring, University of Maryland
Submitted by:
CRAWDAD Team
Last updated:
Wed, 03/25/2009 - 08:00
DOI:
doi.org/10.15783/C7J59R

Abstract 

We collected a trace of wireless network activity at SIGCOMM 2008. The subjects of the traced network chose to participate by joining the traced SSID. The release contains three types of anonymized traces: 802.11a, Ethernet, and Syslog from the Access Point. We anonymized the trace data using a modified version (http://www.cs.umd.edu/projects/wifidelity/sigcomm08_traces/sigcomm08-tcp...) of the tcpmkpub tool (http://www.icir.org/enterprise-tracing/tcpmkpub.html). The packet traces include anonymized DHCP and DNS headers.

last modified : 2009-03-25

release date : 2009-03-02

date/time of measurement start : 2008-08-17

date/time of measurement end : 2008-08-21

collection environment : We collected a trace of wireless network activity at SIGCOMM 2008. The subjects of the traced network chose to participate by joining the traced SSID. Our goal was to gather a detailed trace of network activity at SIGCOMM 2008, both to improve 802.11 tracing techniques as part of the Wifidelity project and to enable analysis of the behavior of a wireless LAN that is (presumably) heavily used.

network configuration : We used four BSSIDs on four channels with one NAT (Network Address Translation) router. To collect the traces, we deployed eight 802.11a monitors so that two monitors were assigned to each channel. A Xirrus Wi-Fi Array (http://www.xirrus.com/products/arrays-80211abg.php) provided the traced 802.11a network (SSID: SIGCOMM-ONLY-Traced). The Wi-Fi Array broadcast four BSSIDs, one on each of four 802.11a channels. After anonymization, the DHCP-assigned IP addresses for clients are in the following subnets: 26.12.0.0/16 and 26.2.0.0/16.
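As a small illustration of how the anonymized client subnets listed above might be used when processing the traces, the following sketch checks whether an address belongs to a traced client. The function name is hypothetical and is not part of the dataset tooling.

```python
import ipaddress

# Anonymized client subnets, as listed in this dataset description.
CLIENT_SUBNETS = [
    ipaddress.ip_network("26.12.0.0/16"),
    ipaddress.ip_network("26.2.0.0/16"),
]


def is_traced_client(addr: str) -> bool:
    """Return True if addr falls inside one of the anonymized client subnets.

    Hypothetical helper for illustration only.
    """
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in CLIENT_SUBNETS)
```

A call such as `is_traced_client("26.12.3.4")` checks membership in either /16; addresses outside both subnets are reported as non-clients.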

data collection methodology : We recorded network protocol information from all wired and wireless packets sent on the wireless network of SSID: SIGCOMM-ONLY-Traced. Each packet includes physical layer information (in the Prism header), such as the wireless signal strength, as well as the 802.11, IP, TCP, UDP, and ICMP headers, depending on the packet type. We did not record packet payloads above the transport layer, except for DHCP and DNS payloads. In the recorded data, we anonymized or deleted potentially sensitive information such as MAC and IP addresses and DHCP and DNS headers.

sanitization : Users chose to participate in the trace by associating with the SIGCOMM-ONLY-Traced SSID. All other users joined the "Untraced" SSID, SIGCOMM-ONLY-Untraced. The traces do not contain any data from the "Untraced" SSID. We anonymized the traces to protect the identity and activity of users who opted to be traced during SIGCOMM 2008.
  - Filtering 802.11a traces: Each packet in the wireless traces meets one or both of the following criteria:
    1. The BSSID address matches the "traced" BSSID.
    2. The packet is a probe request for the "SIGCOMM-ONLY-Traced" SSID.
  - Filtering Ethernet traces: The AP was set up with a monitor VLAN for the "SIGCOMM-ONLY-Traced" network.
  - Filtering Syslog traces: The syslog trace only contains information about users associated with the "traced" network. To filter out syslog messages about "Untraced" users, we included all syslog messages logged while a client was associated with the "traced" network. The syslog messages indicate when a client associates with, and disassociates from, the "traced" network.
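The two filtering rules above (the 802.11a packet criteria and the syslog association window) can be sketched roughly as follows. This is not the authors' filtering code: the `Frame` record, the event tuples, and both function names are hypothetical simplifications, and the sketch assumes that association events refer only to the traced network.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set, Tuple

TRACED_SSID = "SIGCOMM-ONLY-Traced"


@dataclass
class Frame:
    """Hypothetical, simplified view of a captured 802.11 frame."""
    bssid: str
    is_probe_request: bool = False
    probe_ssid: Optional[str] = None


def keep_wireless_frame(frame: Frame, traced_bssids: Set[str]) -> bool:
    """Keep a frame if its BSSID is traced or it probes for the traced SSID."""
    if frame.bssid in traced_bssids:
        return True
    return frame.is_probe_request and frame.probe_ssid == TRACED_SSID


# Each event is (client, kind, message), with kind one of
# "assoc", "disassoc", or "other" (an assumed, simplified encoding).
Event = Tuple[str, str, str]


def filter_syslog(events: Iterable[Event]) -> Iterator[Event]:
    """Yield only messages logged while the client is associated."""
    associated: Set[str] = set()
    for client, kind, message in events:
        if kind == "assoc":
            associated.add(client)
        if client in associated:
            yield (client, kind, message)
        if kind == "disassoc":
            associated.discard(client)
```

In this sketch the association and disassociation messages themselves are kept, since they mark the boundaries of the window; whether the released syslog trace does the same is an assumption.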

Traceset

umd/sigcomm2008/pcap

PCAP traceset of wireless network measurement at the SIGCOMM 2008 conference.

  • file: sigcomm08_traces.tar.gz
  • description: We collected pcap traces of wireless network activity at SIGCOMM 2008. The subjects of the traced network chose to participate by joining the traced SSID.
  • measurement purpose: Network Diagnosis
  • methodology:
    1. 802.11a: During most of the conference, approximately two 802.11a monitors were placed at each of the four corners of the main conference hall. We did not record the exact location of each monitor. However, we tried to capture each channel with two monitors placed at opposite corners of the room.
    2. Ethernet: Packets sent from the NAT to the AP and from the AP to the NAT were captured using an Ethernet trace collector attached to the packet dump port on the WiFi Array.
  • sanitization: The packets are anonymized using a modified version of the tcpmkpub tool. The tool is available from the download link of [sigcomm08-tcpmkpub.tar.gz]. Metadata about the trace anonymization is provided in the file tcpmkpub.log.export. In the description below, [new] indicates new functionality added to tcpmkpub, and [tcpmkpub] indicates the functionality of the original tcpmkpub tool, described in the following reference: R. Pang, M. Allman, V. Paxson, and J. Lee. The Devil and Packet Trace Anonymization. SIGCOMM Computer Communication Review, 2006. [Crypto-PAn] indicates the prefix-preserving IP address anonymization scheme described in the following reference: J. Xu, J. Fan, M. H. Ammar, and S. B. Moon. Prefix-preserving IP address anonymization: measurement-based security evaluation and a new cryptography-based scheme. In Proceedings of the IEEE International Conference on Network Protocols (ICNP), pages 280–289, Nov. 2002.
    1. Checksums (IP/UDP/TCP) [tcpmkpub]
       The anonymization code recomputes checksums. The anonymization metadata (tcpmkpub.log.export) holds information about packets in the traces with bad checksums. Bad checksums are indicated in the anonymized traces by a 1 in the checksum field, or 2 if the checksum was 1. A UDP checksum of 0 is not changed.
    2. Link Layer
       A. Ethernet [tcpmkpub]
          MAC addresses:
          - The three high-order bytes and the three low-order bytes are hashed separately.
          - The high-order 3 bytes are hashed to retain vendor information.
          - Addresses containing all 1's or all 0's are not changed.
          - The multicast bit is retained.
       B. VLAN [new]
          The VLAN header did not need to be anonymized.
       C. 802.11 [new]
          - MAC addresses are anonymized using the same method as the Ethernet MAC addresses.
          - If the packet is fragmented (fragment bit == 1 or fragment # > 0), the rest of the packet is skipped.
    3. Network Layer
       A. IP [tcpmkpub]
          - External addresses are hashed using a prefix-preserving scheme [Crypto-PAn].
          - Internal addresses are hashed to a prefix that is unused by the external addresses, and the subnet and host portions of the address are transformed.
          - Multicast addresses are not anonymized.
          - The [tcpmkpub] paper recommends removing packets from network scanners. We did not consider this a threat to our network because the identity tied to a local address was dynamic.
       B. ARP [tcpmkpub]
          - If the ARP packet contains a partial IP packet, the IP anonymization above is used.
          - IP addresses are anonymized using the IP anonymization procedure above.
    4. Transport Layer
       A. TCP [tcpmkpub]
          - The TCP timestamp options are transformed into separate monotonically increasing counters, with no relationship to time, for each IP address in the anonymized trace.
          - If the timestamp is 0, it is not modified.
          - Otherwise, the timestamp is replaced with a unique number incremented in the order of the trace.
       B. UDP [tcpmkpub]
          The checksum is recomputed according to the checksum policy above.
    5. Application Layer
       A. DNS [new]
          - DNS labels are anonymized individually by taking the keyed HMAC of the label.
          - The low-order 8 bytes of the hash digest are kept as the label.
          - The digest is converted to ASCII by converting it to hex.
          - The new length of the DNS packet is stored in the following fields: [IP/UDP/DNS, PCAP Captured, PCAP On Wire].
          - Any type 'A' resource record data is anonymized using the IP anonymization scheme above.
          DNS packets may be cut off because of the snaplen at capture.
       B. DHCP [new]
          - Client IP address is anonymized.
          - Client hardware address is anonymized.
          - Your IP address (yiaddr) is anonymized.
          The rest of the DHCP packets were cut off by the snaplen at capture.
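To make the MAC address and DNS label steps described above more concrete, here is a minimal sketch. It is not the tcpmkpub code: the key, the choice of HMAC-SHA256, the domain-separation prefixes, and the reading of "low-order 8 bytes" as the last 8 bytes of the digest are all assumptions made for illustration, and the function names are hypothetical. Updating the packet length fields after a label changes size is not shown.

```python
import hashlib
import hmac

KEY = b"example-key"  # hypothetical key; the real key is not published


def _hash3(key: bytes, data: bytes) -> bytes:
    """Return 3 bytes derived from a keyed hash (HMAC-SHA256 is assumed)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:3]


def anonymize_mac(mac: bytes, key: bytes = KEY) -> bytes:
    """Hash the high and low 3 bytes separately, keeping the multicast bit."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    # All-zero and all-one addresses are left unchanged.
    if mac in (b"\x00" * 6, b"\xff" * 6):
        return mac
    # Same vendor prefix in, same hashed prefix out.
    high = bytearray(_hash3(key, b"hi" + mac[:3]))
    low = _hash3(key, b"lo" + mac[3:])
    # The multicast bit is the least significant bit of the first byte.
    high[0] = (high[0] & 0xFE) | (mac[0] & 0x01)
    return bytes(high) + low


def anonymize_dns_label(label: bytes, key: bytes = KEY) -> str:
    """Replace one DNS label with the hex form of 8 digest bytes."""
    digest = hmac.new(key, label, hashlib.sha256).digest()
    return digest[-8:].hex()  # 16 ASCII hex characters
```

Because the two halves of the address are hashed independently, two devices from the same vendor keep a common (anonymized) prefix, which is the property the description refers to as retaining vendor information.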

umd/sigcomm2008/pcap Traces

  • 802.11a: PCAP traces of wireless network measurement collected from the wireless side at the SIGCOMM 2008 conference.
    • configuration: During most of the conference, approximately two 802.11a monitors were placed at each of the four corners of the main conference hall. We did not record the exact location of each monitor. However, we tried to capture each channel with two monitors placed at opposite corners of the room. The network topology is configured as follows: users are in 26.12.*.* and 26.2.*.*; network management is in 26.6.*.*.
    • format:

      sigcomm08_wl_(monitor #)_(first packet time)_(last packet time)_(bssid)_(channel).pcap

  • Ethernet: PCAP traces of wireless network measurement collected from the Ethernet side in the SIGCOMM 2008 conference.
    • configuration: Packets sent from the NAT to the AP and from the AP to the NAT were captured using an Ethernet trace collector attached to the packet dump port on the WiFi Array. The network topology is configured as follows: users are in 26.12.*.* and 26.2.*.*; network management is in 26.6.*.*.
    • format:

      sigcomm08_eth_(first packet time)_(last packet time).pcap

  • anonymization_log: The anonymization log of wireless network traces in the SIGCOMM 2008 conference.
    • configuration: tcpmkpub anonymization log for the traces 'umd/sigcomm2008/pcap/802.11a' and 'umd/sigcomm2008/pcap/Ethernet', and md5 checksums for the trace files.
    • format:

      The anonymization log file name is 'tcpmkpub.log.export'.
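The file name formats given above for the 802.11a and Ethernet traces could be split into their fields with a sketch like the one below. The function and the field keys are hypothetical, the fields are kept as opaque strings because the exact time and BSSID encodings are not described here, and the sketch assumes that no field contains an underscore.

```python
from typing import Dict


def parse_trace_name(name: str) -> Dict[str, str]:
    """Split a trace file name into labeled fields (illustrative only)."""
    suffix = ".pcap"
    if not name.endswith(suffix):
        raise ValueError(f"not a pcap trace name: {name!r}")
    parts = name[: -len(suffix)].split("_")
    if parts[:2] == ["sigcomm08", "wl"] and len(parts) == 7:
        # sigcomm08_wl_(monitor #)_(first)_(last)_(bssid)_(channel).pcap
        keys = ["monitor", "first_packet_time", "last_packet_time",
                "bssid", "channel"]
    elif parts[:2] == ["sigcomm08", "eth"] and len(parts) == 4:
        # sigcomm08_eth_(first)_(last).pcap
        keys = ["first_packet_time", "last_packet_time"]
    else:
        raise ValueError(f"unrecognized trace name: {name!r}")
    fields = dict(zip(keys, parts[2:]))
    fields["kind"] = parts[1]  # "wl" or "eth"
    return fields
```

Grouping the wireless files by the `channel` or `monitor` field is one way to pair up the two monitors that covered each channel.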

umd/sigcomm2008/syslog

Syslog traceset of wireless network measurement in the SIGCOMM 2008 conference.

  • file: sigcomm08_syslog.tar.gz
  • description: We collected syslog traces of wireless network activity at SIGCOMM 2008. The subjects of the traced network chose to participate by joining the traced SSID.
  • measurement purpose: Network Diagnosis
  • methodology: A tracing box connected to the Array's management port collected syslog traces. Unfortunately, after the conference we noticed that these traces were corrupted. However, we were able to salvage one of the syslog traces because we collected it with the Ethernet tracing box.
  • sanitization: macmkpub, a MAC address anonymizer based on the tcpmkpub anonymization code, anonymized the MAC addresses in the syslog traces. Metadata about the trace anonymization is provided in the file 'tcpmkpub.log.export'.

umd/sigcomm2008/syslog Traces

  • Ethernet: Syslog traces of wireless network measurement in the SIGCOMM 2008 conference.
    • configuration: We collected syslog traces with the Ethernet tracing box. The network topology is configured as follows: users are in 26.12.*.* and 26.2.*.*; network management is in 26.6.*.*.
    • format:

      sigcomm08_syslog_(first log time)_(last log time)

Instructions: 

The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort. 

About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing. 

CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022. 

Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques. 

Please acknowledge the source of the Data in any publications or presentations reporting use of this Data. 

Citation:

Aaron Schulman, Dave Levin, Neil Spring, umd/sigcomm2008, https://doi.org/10.15783/C7J59R, Date: 20090302

Documentation

Attachment: umd-sigcomm2008-readme.txt (1.62 KB)

