CRAWDAD thlab/sigcomm2009

Citation Author(s):
Anna-Kaisa Pietilainen, Paris Research Innovation Center - Technicolor
Christophe Diot
Submitted by:
CRAWDAD Team
Last updated:
Sun, 07/15/2012 - 08:00
DOI:
10.15783/C70P42
Citations:
2

Abstract 

Traces of Bluetooth encounters, opportunistic messaging, and social profiles of 76 users of the MobiClique application at SIGCOMM 2009.

The dataset contains data collected by an opportunistic mobile social application, MobiClique. The application was used by 76 people during the SIGCOMM 2009 conference in Barcelona, Spain. The dataset includes traces of Bluetooth device proximity, opportunistic message creation and dissemination, and the social profiles (friends and interests) of the participants.

date/time of measurement start: 2009-08-17

date/time of measurement end: 2009-08-21

collection environment: The data set was collected at the SIGCOMM 2009 conference in Barcelona, Spain. Around 100 smartphones were distributed to a set of volunteers during the first two days of the conference. The participants were recruited on-site in conjunction with the conference registration. Each device was initialized with the social profile of the participant, which included some basic information such as home city, country and affiliation. In addition, each participant was asked to log on to their Facebook profile in order to include the list of Facebook friends and interests in the social profile. The participants could edit the social profile before it was uploaded to the device and recorded in our traces. Each participant was instructed to keep the device with them and powered on at all times, and to use the MobiClique application for mobile social networking during the conference. The participants could also use the device as their personal mobile phone during the conference by inserting their personal SIM card into it. The final trace contains data from the 76 devices that show significant activity during the experiment.

network configuration: The network is a Bluetooth-based opportunistic network created among the participating devices during the conference. Each device performs a periodic Bluetooth device discovery every 120 +/- 10.24 seconds for a duration of 10.24 seconds to find nearby devices. Upon discovering new contacts, the devices form an RFCOMM link on a preconfigured channel for data communications. Both the Bluetooth name query and service discovery are disabled. The experimental hardware is an HTC s620 Windows Mobile smartphone. The HTC s620 has a 200 MHz TI processor, 64 MB of RAM, 128 MB of ROM and a MicroSD slot. The radio interfaces include a quad-band GSM/EDGE cellular radio, Bluetooth v1.2 and 802.11b/g. The Bluetooth radio is a class 2 device with a radio range of around 10-20 meters.

data collection methodology: Each device records the results of the periodic device discovery and all data communications (RFCOMM link setup and bytes sent/received). In addition, the devices record details of the user's social profile and its evolution, and application-level messaging. All traces are recorded continuously in text files on the device's SD memory card. All traces are timestamped based on the device clock and reported as a relative time in seconds since the start of the experiment, 17/08/2009 08:00. The device clocks were set manually to the same reference time at the beginning of the experiment.
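The relative timestamps can be mapped back to calendar time with a few lines of Python. The following is a minimal sketch, not part of the dataset tooling: the names `EXPERIMENT_START`, `to_wall_clock` and `to_relative` are hypothetical, the reference instant is treated as a naive local time because the description does not state a time zone, and per-device clock drift is ignored.

```python
from datetime import datetime, timedelta

# Start of the experiment as stated in the dataset description.
# No time zone is given in the source, so a naive datetime is used (assumption).
EXPERIMENT_START = datetime(2009, 8, 17, 8, 0, 0)


def to_wall_clock(relative_seconds):
    """Convert a trace timestamp (seconds since experiment start) to a datetime."""
    return EXPERIMENT_START + timedelta(seconds=float(relative_seconds))


def to_relative(moment):
    """Inverse of to_wall_clock: datetime -> seconds since experiment start."""
    return (moment - EXPERIMENT_START).total_seconds()
```

For example, `to_wall_clock(3600)` corresponds to 17/08/2009 09:00 on the clock of the device that recorded the value.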

sanitization: All sensitive identifiers including Facebook identifiers, social profile data and Bluetooth MAC addresses are replaced with random integer ids.

limitation: The social profiles, in particular the lists of friends and interest groups, are not necessarily complete, because the participants could remove any details they wished before the data was uploaded to the device and recorded. This option was given for privacy reasons, as the application shared all profile details with all nodes. The Bluetooth proximity data and RFCOMM data communications suffer from the known limitations of Bluetooth technology: the device discovery process is slow and regularly misses some nearby devices, and RFCOMM links (setup and transmission) often fail when there are many Bluetooth devices in range. The timestamps of different devices are not synchronized. The clocks were set manually to the same reference time at the beginning of the experiment, but significant clock drift is visible in the final data. The traces can be synchronized based on mutual sightings and/or data transmission traces. Because of the constantly running periodic Bluetooth device discovery and frequent data communications, the battery life of the devices was limited to about one day or less, depending on other usage. Hence, the devices are active and collecting data during varying periods of time, depending on how faithfully the device owner charged the device.

Traceset

thlab/sigcomm2009/mobiclique

Traces of Bluetooth encounters, opportunistic messaging, and social profiles of 76 users of the MobiClique application at SIGCOMM 2009.

  • file: sigcomm2009.tar.gz
  • description: The traceset contains data collected by an opportunistic mobile social application, MobiClique. The application was used by 76 people during the SIGCOMM 2009 conference in Barcelona, Spain. The traceset includes traces of Bluetooth device proximity, opportunistic message creation and dissemination, and the social profiles (friends and interests) of the participants.
  • measurement purpose: User Mobility Characterization, Routing Protocol for DTNs (Disruption Tolerant Networks), Social Network Analysis, Human Behavior Modeling, Opportunistic Connectivity
  • methodology: Each device performs a periodic Bluetooth device discovery every 120 +/- 10.24 seconds for a duration of 10.24 seconds to find nearby devices. Upon discovering new contacts, the devices form an RFCOMM link on a preconfigured channel for data communications. Both the Bluetooth name query and service discovery are disabled. Each device records the results of the periodic device discovery and all data communications (RFCOMM link setup and bytes sent/received). In addition, the devices record details of the user's social profile and its evolution, and application-level messaging. All traces are recorded continuously in text files on the device's SD memory card. All traces are timestamped based on the device clock and reported as a relative time in seconds since the start of the experiment, 17/08/2009 08:00. The device clocks were set manually to the same reference time at the beginning of the experiment.

thlab/sigcomm2009/mobiclique Traces

  • participants: List of participants and basic social profiles including home city, country, and affiliation.
    • configuration: This data was requested from each user when they joined the experiment and was also part of the user's public social profile during the experiment (together with their real name, which we do not disclose here for privacy reasons).
    • format: csv: user_id;key;value

      The user_ids run from 1 to 76 (inclusive). Each user carries a single device that is identified by the same user_id. The 'key' is one of ['institute','city','country'] and the values are anonymized to simple integer ids.

    • sanitization: The values of key are anonymized to simple integer ids.
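As an illustration of the `user_id;key;value` layout, the sketch below pivots the rows into one dictionary per user. It assumes the file has no header row and that each user has at most one value per key; neither is stated in the description. The function name `parse_participants` and the sample rows are made up for the example.

```python
from collections import defaultdict


def parse_participants(lines):
    """Pivot user_id;key;value rows into {user_id: {key: value}}."""
    profiles = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, key, value = line.split(";")
        # Both the user id and the anonymized value are integer ids.
        profiles[int(user_id)][key] = int(value)
    return dict(profiles)


# Made-up sample rows, not taken from the dataset.
sample = ["1;institute;12", "1;city;4", "1;country;7", "2;country;7"]
profiles = parse_participants(sample)

# Users sharing the (anonymized) country id 7.
same_country = [u for u, p in profiles.items() if p.get("country") == 7]
```

With real data, `lines` would be the lines of the participants trace read from the archive.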

  • interests1: The initial interest groups of the participants, based on their Facebook groups and networks. The list also contains three pre-configured common groups for each participant (ids 1,2,3).
    • configuration: The MobiClique application adds three common interest groups for everybody (group_id=[1,2,3]). In addition, we use a simple Facebook desktop application to get the list of Facebook groups and networks for each participant to initialize their profile. Although each participant logged on to their Facebook account for the initial configuration, they could remove any details they wished before the application was run for the first time and the data was recorded in our trace file. Hence, the initial interest list does not necessarily contain the full list of Facebook groups of each participant.
    • format: csv: user_id;group_id
  • friends1: The initial friendship graph of the participants, based on their Facebook friends.
    • configuration: We use a simple Facebook desktop application to get the list of Facebook friends for each participant to initialize their profile. Although each participant logged on to their Facebook account for the initial configuration, they could remove any details they wished before the application was run for the first time and the data was recorded in our trace file. Hence, the initial friendship graph does not necessarily contain the full list of Facebook friends of each participant, and the relationships may be asymmetric.
    • format: csv: user_id;friend_user_id
  • interests2: The evolution of interest groups. The MobiClique application lets users discover and join existing interest groups, and create new interest groups at any time. Hence, the interest lists change over time.
    • configuration: The trace is based on the MobiClique application usage on each device.
    • format: csv: user_id;group_id;timestamp. The group_ids less than 1000 correspond to the initial interest groups, and group_ids of 1000 and higher are new ad hoc groups created during the experiment. The timestamp is the relative time in seconds since the start of the experiment, 17/08/2009 08:00.
  • friends2: The evolution of the friendship graph. As with the interest groups, the MobiClique application lets users discover and friend other MobiClique users upon opportunistic encounters with them. Hence, the friendship graph changes over time.
    • configuration: The trace is based on the MobiClique application usage on each device.
    • format: csv: user_id;friend_user_id;timestamp. The timestamp is the relative time in seconds since the start of the experiment, 17/08/2009 08:00.
  • activity: The activity periods of each participant and device. A device is active when it is collecting data. The inactivity periods occur due to batteries running out, at night time when the device is turned off, and due to some software problems.
    • configuration: The trace is calculated based on the periodic device discovery logs.
    • format: csv: user_id;start;end. The start and end timestamps are the relative times in seconds since the start of the experiment, 17/08/2009 08:00.
  • proximity: The Bluetooth device discovery logs. The trace records all the nearby Bluetooth devices reported by the periodic Bluetooth device discoveries.
    • configuration: Each device performs a periodic Bluetooth device discovery every 120 +/- 10.24 seconds (randomized) for 10.24 seconds. Both the Bluetooth name and service queries are disabled to speed up the discovery process.
    • format: csv: timestamp;user_id;seen_user_id;seen_device_major_cod;seen_device_minor_cod

      The timestamp is the relative time in seconds since the start of the experiment, 17/08/2009 08:00. The user_ids below 100 are the experimental devices; user_ids of 100 or above are external Bluetooth devices seen during the experiment. The device_major_cod and device_minor_cod correspond to the device's standard Bluetooth Class of Device values. All MobiClique devices have a hard-coded device_minor_cod of 128 for identification purposes. Note that Bluetooth device discovery (and thus the trace) is asymmetric, i.e., a device A may see device B at some point in time but not the other way around.
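A common first step with sighting logs of this kind is to merge consecutive sightings of the same device pair into contact intervals. The sketch below is one possible way to do that, not the authors' method. The 150-second `max_gap` is a hypothetical threshold chosen to be slightly longer than one discovery period, the function names are made up, and the sample rows are invented. Because discovery is asymmetric, pairs are kept directed.

```python
from collections import defaultdict


def parse_proximity(lines):
    """Yield (timestamp, user_id, seen_user_id) tuples from proximity rows."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        ts, user, seen, _major_cod, _minor_cod = line.split(";")
        yield float(ts), int(user), int(seen)


def contact_intervals(sightings, max_gap=150.0, internal_only=True):
    """Merge sightings of each directed pair into (start, end) intervals.

    max_gap is an assumed threshold: two sightings further apart than this
    are treated as belonging to separate contacts.
    """
    by_pair = defaultdict(list)
    for ts, user, seen in sightings:
        if internal_only and seen >= 100:
            continue  # ids of 100 or above are external Bluetooth devices
        by_pair[(user, seen)].append(ts)

    contacts = {}
    for pair, times in by_pair.items():
        times.sort()
        intervals = [[times[0], times[0]]]
        for ts in times[1:]:
            if ts - intervals[-1][1] <= max_gap:
                intervals[-1][1] = ts  # extend the current contact
            else:
                intervals.append([ts, ts])  # start a new contact
        contacts[pair] = [tuple(interval) for interval in intervals]
    return contacts


# Made-up sample rows, not taken from the dataset.
rows = ["100;1;2;2;128", "225;1;2;2;128", "900;1;2;2;128", "120;1;150;1;4"]
contacts = contact_intervals(parse_proximity(rows))
```

Since discovery regularly misses nearby devices, a larger `max_gap` may be more appropriate in practice; the value is a modelling choice rather than a property of the trace.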

  • messages: User-generated messages. This file lists the application level messages created by the users during the experiment. MobiClique allowed messaging between friends or among members of an interest group. In addition, MobiClique contained an epidemic voting application that allowed users to give rankings (1 to 5 stars) to the talks of the conference and see the real time results on their device.
    • configuration: The trace is based on the MobiClique application usage on each device.
       
    • format: csv: msg_id;src_user_id;created;type;dst

      Each message has a unique msg_id (used in the message transmission traces). The messages are either unicast (type=U), multicast (type=M), or broadcast (type=B) messages. The destination ('dst' field) for unicast messages is another user. These messages are either delivered directly to the destination upon an encounter or forwarded to the friends of the destination. The multicast messages are targeted to an interest group ('dst' is an interest group id). These messages are forwarded only to the members of the destination group. The broadcast messages (empty dst) are created by an epidemic voting application. As each vote requires only a few bytes of data, the application aggregates them in a single broadcast message that contains all the votes given and/or received by the device at any time. For this reason, the transmission log does not detail the message id for the broadcast messages. The creation timestamp is the relative time in seconds since the start of the experiment, 17/08/2009 08:00. This is the time when the message is inserted into the MobiClique network queue; the actual sending takes place asynchronously upon encountering suitable target devices.

  • transmission: Message transmission logs. The data transmission protocol logs from the sender side. Data is transmitted between two devices using the Bluetooth RFCOMM protocol on a fixed channel (no service discovery required).
    • configuration: The trace is based on the MobiClique application operation on each device.
    • format: csv: type;msg_id;bytes;hop_src_user_id;hop_dst_user_id;src_timestamp;status

      There are three types of protocol data units: 1=handshake (exchange of user profiles at contact start), 2=unicast or multicast message, 3=broadcast message. The handshake takes place upon every new encounter to exchange the social profiles and message Bloom filters between the two nodes. The unicast and multicast msg_ids match the msg_id in the messages trace. The handshake and broadcast msg_ids are set to -1. The bytes field indicates the size of the message in bytes. The src_timestamp is the time at the sender, given as the relative time in seconds since the start of the experiment, 17/08/2009 08:00. The status codes are 0 for success and 1 for failure.
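To illustrate how the type and status fields can be used together, the following sketch computes the fraction of successful records per protocol data unit type. It assumes semicolon-separated rows with no header, in the field order given above. `PDU_TYPES`, `success_rate_by_type` and the sample rows are made up for the example.

```python
from collections import Counter

# Type codes as described for the transmission trace.
PDU_TYPES = {1: "handshake", 2: "unicast/multicast", 3: "broadcast"}


def success_rate_by_type(lines):
    """Return {pdu_name: fraction of records whose status is 0 (success)}."""
    total, ok = Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        pdu_type, _msg_id, _size, _src, _dst, _ts, status = line.split(";")
        name = PDU_TYPES[int(pdu_type)]
        total[name] += 1
        if int(status) == 0:
            ok[name] += 1
    return {name: ok[name] / total[name] for name in total}


# Made-up sample rows, not taken from the dataset.
rows = [
    "1;-1;512;1;2;100;0",
    "1;-1;512;1;3;200;1",
    "2;17;2048;1;2;105;0",
    "3;-1;64;1;2;110;0",
]
rates = success_rate_by_type(rows)
```

In this invented sample, one of the two handshakes fails, so the handshake rate is 0.5 and the other two rates are 1.0.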

  • reception: Message reception logs. The data transmission protocol logs from the receiver side. Data is transmitted between two devices using the Bluetooth RFCOMM protocol on a fixed channel (no service discovery required).
    • configuration: The trace is based on the MobiClique application operation on each device.
    • format: The format is identical to the transmission trace, except all the timestamps are on the receiving device's time.
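Because the transmission and reception traces record the same transfer on two different device clocks, they offer one way to estimate the clock offset between devices. The sketch below is an illustration under several assumptions, not the authors' synchronization procedure. It uses only successful type-2 records, since handshake and broadcast records all carry msg_id -1. It assumes hop_src_user_id and hop_dst_user_id have the same meaning in both traces, ignores transfer delay, and summarizes the offset with a single median even though drift makes it vary over time. The function names and sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import median


def parse_pdu_log(lines):
    """Parse rows in the layout shared by the transmission and reception traces."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        pdu_type, msg_id, _size, src, dst, ts, status = line.split(";")
        records.append(
            (int(pdu_type), int(msg_id), int(src), int(dst), float(ts), int(status))
        )
    return records


def estimate_offsets(tx_lines, rx_lines):
    """Median (receiver clock - sender clock) per directed device pair."""

    def successful_messages(lines):
        seen = defaultdict(list)
        for pdu_type, msg_id, src, dst, ts, status in parse_pdu_log(lines):
            if pdu_type == 2 and status == 0:
                seen[(msg_id, src, dst)].append(ts)
        # Keep only keys logged exactly once so the match is unambiguous.
        return {key: ts[0] for key, ts in seen.items() if len(ts) == 1}

    sent = successful_messages(tx_lines)
    received = successful_messages(rx_lines)
    diffs = defaultdict(list)
    for key, tx_ts in sent.items():
        if key in received:
            _msg_id, src, dst = key
            diffs[(src, dst)].append(received[key] - tx_ts)
    return {pair: median(values) for pair, values in diffs.items()}


# Made-up sample rows, not taken from the dataset.
tx = ["2;17;2048;1;2;1000;0", "2;18;1024;1;2;5000;0",
      "2;19;100;1;2;9000;1", "1;-1;512;1;2;990;0"]
rx = ["2;17;2048;1;2;1012;0", "2;18;1024;1;2;5016;0", "1;-1;512;1;2;1001;0"]
offsets = estimate_offsets(tx, rx)
```

In the invented sample, device 2's clock reads 12 s and 16 s ahead of device 1's for the two matched messages, so the estimate for the pair (1, 2) is 14.0 seconds. A time-varying fit per pair would be needed to follow drift over the whole experiment.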
Instructions: 

The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort. 

About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing. 

CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022. 

Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques. 

Please acknowledge the source of the Data in any publications or presentations reporting use of this Data. 

Citation:

Anna-Kaisa Pietilainen, Christophe Diot, thlab/sigcomm2009, https://doi.org/10.15783/C70P42, Date: 20120715


Documentation

Attachment: thlab-sigcomm2009-readme.txt (1.58 KB)

These datasets are part of Community Resource for Archiving Wireless Data (CRAWDAD). CRAWDAD began in 2004 at Dartmouth College as a place to share wireless network data with the research community. Its purpose was to enable access to data from real networks and real mobile users at a time when collecting such data was challenging and expensive. The archive has continued to grow since its inception, and starting in summer 2022 is being housed on IEEE DataPort.
