CRAWDAD cambridge/haggle (v. 2006-01-31)

Citation Author(s):
James Scott
Richard Gass, Telefonica I+D
Jon Crowcroft, University of Cambridge
Pan Hui, University of Cambridge
Christophe Diot, Paris Research Lab
Augustin Chaintreau, Columbia University
Submitted by:
CRAWDAD Team
Last updated:
Thu, 11/09/2006 - 08:00
DOI:
10.15783/C77G6X
Data Format:
License:
Citations:
1
Categories:
Keywords:

Abstract 

Traces of Bluetooth sightings by groups of users carrying small devices (iMotes) for a number of days.

This data includes a number of traces of Bluetooth sightings by groups of users carrying small devices (iMotes) for a number of days - in office environments and conference environments.

All versions of this dataset, oldest to newest: v. 2006-01-31,  v. 2006-09-15,  v. 2009-05-29.

last modified :

2006-11-09

release date :

2006-01-31

date/time of measurement start :

2005-01-06

date/time of measurement end :

2005-03-10

collection environment :

Three iMote-based experiments were conducted. The first included eight researchers 
and interns working at Intel Research in Cambridge. The second obtained data
from twelve doctoral students and faculty comprising a research group at the 
University of Cambridge Computer Lab. The third experiment was conducted 
during the IEEE INFOCOM 2005 conference in Miami where 41 iMotes were carried 
by attendees for 3 to 4 days.

network configuration :

We set up experiments making use of the iMote platform made by Intel Research. 
iMotes are derived from the Berkeley Mote, with the current version based around 
the Zeevo TC2001P system-on-a-chip providing an ARM7 processor and Bluetooth support. 
Along with a 950mAh CR2 battery, each iMote was enclosed in packaging designed 
to be convenient for test subjects to continually carry. Two types of packaging 
were made available : some iMotes were made into keyfobs while others were enclosed 
in small boxes. Subjects were asked to pick the form factor which allowed them 
to conveniently keep the iMote with them at all times, with most simply attaching 
the iMote to their keys.

data collection methodology :

iMote contacts were classified into two groups: an iMote's sightings 
of another iMote are classified as "internal" contacts, while sightings of 
other types of Bluetooth devices are called "external" contacts. The external 
contacts are numerous and include anyone who has an active Bluetooth device 
in the vicinity of the iMote carriers, thereby providing a measure of actual
wireless networking opportunities present at the time.  The internal contacts, 
on the other hand, represent the data transfer opportunities that each of 
our participants would have, if they were equipped with devices which
are always-on and always-carried.
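
The internal/external split can be sketched in a few lines of Python. This is only an illustration, not part of the dataset tooling: the function name `classify_contact` and the parameter `n_imotes` are hypothetical, and the sketch assumes that the iMotes of an experiment are numbered 1..n and that every higher ID is an external device, as the per-trace ID tables in this record suggest.

```python
def classify_contact(seen_id: int, n_imotes: int) -> str:
    """Label a sighting as "internal" or "external".

    Assumption (not stated as a rule by the authors): IDs 1..n_imotes are
    the distributed iMotes, and every larger ID is an external device.
    """
    if seen_id < 1:
        raise ValueError("device IDs start at 1")
    return "internal" if seen_id <= n_imotes else "external"


# Example with the ID ranges listed for the Intel trace (IDs 1-9 are iMotes).
labels = [classify_contact(i, n_imotes=9) for i in (1, 9, 10, 128)]
```

With the Intel ID ranges, IDs 1 and 9 are labelled internal and IDs 10 and 128 external.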

sanitization :

An anonymised version of our data will be made available to other research 
groups on demand.

Traceset

cambridge/haggle/imote

Three traces of Bluetooth sightings by groups of users carrying small devices (iMotes) for a number of days.

  • files: imote-trace1.tar.gz, imote-trace2.tar.gz, imote-trace3.tar.gz
  • description: This traceset includes three traces of Bluetooth sightings by groups of users carrying small devices (iMotes) for a number of days: at the Intel Research Cambridge Corporate Laboratory, at the Computer Lab of the University of Cambridge, and at the IEEE Infocom conference in the Grand Hyatt Miami.
  • measurement purpose: User Mobility Characterization
  • methodology: We tried to keep the processing of the data before public release to a minimum, to leave as much flexibility as possible for research use. Some choices had to be made to reduce power consumption and memory use, and because of specific capabilities of the iMote prototype. Before using these data for your research, it may be important to check that these choices do not affect any of your findings.
    1- Periodic desynchronized scanning. In all our experiments, iMotes were distributed to a group of people to collect any opportunistic sighting of other Bluetooth devices (including the other iMotes distributed). Each iMote scans periodically for devices, asking them to respond with their MAC address, via the paging function. A complete scan takes approximately 5 to 10 s. After initial tests we observed that most contacts were recorded with a 5 s scanning time, and this value was ultimately chosen. The time granularity between two scans is 120 s. It is important to avoid synchronizing two iMotes on the same clock cycle, as an iMote cannot respond to any request while it is actively scanning. We implemented a random dephasing in [-12 s; +12 s] to handle this case.
    2- Skip-length sequence. A contact "A sees B" is defined as a period of time during which all successive scans by A receive a positive answer from B. Ideally, a record should be kept at the end of each contact period. After preliminary tests it became quite clear that a very large number of contact periods were separated by only two intervals. To avoid memory overflow, we decided to implement a skip sequence of "one", meaning that a contact period is only ended after two successive scans fail to get a response. As a consequence, no inter-contact time of less than two intervals can be observed.
    3- Manual time synchronization. Time is not synchronized between iMotes by a central entity, and the traces of different devices bear times that are relative to the starting time of each device. To read all data on the same time axis, devices were started at the same time as far as possible, and a method based on mutual sightings was used to compute manually the shift between different traces. This should be quite accurate for time intervals above 5 min, but we cannot claim complete accuracy at smaller time scales, and we recommend computing mutual sightings to check for any inaccuracies we may have introduced in this data. Time is expressed in seconds; the origin (0 s) corresponds to 12am on the first day of the experiment, so the time of day can be computed from it. Again, the operation was to add a constant to all previously synchronized traces, to reflect the time at which the experiment began. We cannot claim high accuracy (under 5 min).
  • sanitization: - Anonymization and address identifiers. To protect participants' privacy, we chose not to release the MAC addresses, neither those of the iMotes nor those of the external devices recorded. Every device is given a unique identifier, usually called the ID number in this document. Depending on the number, it may refer to an iMote or to the MAC address of another active Bluetooth device that was recorded nearby.
  • last modified: 2006-11-14
  • dataname: cambridge/haggle/imote
  • version: 20060131
  • change: the initial version
  • release date: 2006-01-31
  • date/time of measurement start: 2005-01-06
  • date/time of measurement end: 2005-03-10
  • hole: - Corrupted MAC addresses and discarded motes. After the first couple of experiments, we observed that a number of the MAC addresses recorded differed from a known one by only one or two digits. Most of the time they were recorded once, for a single time slot. It is clear that at least some of them come from a corrupted signal received at the link level by our devices. To ignore this artificial data, we implemented the following rule: "Any MAC address that was recorded only once, for a single scan (that is, related to a unique contact, with length 1s), is assumed to be defective and is ignored." We did not discard any other address: a node that was seen twice, each contact being of length 1s, or a node that was seen once for two successive scans, was included in the final datasets. Another important aspect is that some iMotes could not produce usable data, mostly because of unfortunate hardware resets or losses. These devices may still appear in the traces of other iMotes, and are difficult to interpret as they seem to have an intermittent presence during the experiment. All of them were discarded from the final datasets, to avoid affecting the results in any way.
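
The skip-length rule in the methodology (a contact is closed only after two successive failed scans) can be illustrated with a short Python sketch. It is a reconstruction under stated assumptions, not the iMote firmware: the function name `contacts_from_scans`, the parameter `max_missed`, and the input layout (a time-ordered list of `(scan_time, seen)` pairs for one "A sees B" pair) are all hypothetical.

```python
def contacts_from_scans(scan_hits, max_missed=1):
    """Group the scan results of one "A sees B" pair into contact periods.

    scan_hits: time-ordered list of (scan_time, seen) pairs.
    max_missed: number of consecutive failed scans tolerated inside a
        contact; 1 corresponds to the skip sequence of "one" described in
        the methodology (hypothetical parameter name).
    Returns a list of (first_seen_time, last_seen_time) tuples.
    """
    contacts = []
    start = last = None
    missed = 0
    for t, seen in scan_hits:
        if seen:
            if start is None:
                start = t          # a new contact begins
            last = t
            missed = 0
        elif start is not None:
            missed += 1
            if missed > max_missed:
                contacts.append((start, last))  # contact closed
                start = last = None
                missed = 0
    if start is not None:          # contact still open at end of trace
        contacts.append((start, last))
    return contacts


# One scan every 120 s; B is missed once at 120 s, then twice at 360 s and 480 s.
scans = [(0, True), (120, False), (240, True), (360, False),
         (480, False), (600, True), (720, True)]
merged = contacts_from_scans(scans)                 # skip sequence of one
strict = contacts_from_scans(scans, max_missed=0)   # no skipping
```

With the skip sequence of one, the single miss at 120 s is absorbed and the sketch yields two contacts, (0, 240) and (600, 720). With no skipping, the same scans yield three contacts, which shows why short inter-contact times cannot be observed in the released traces.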

cambridge/haggle/imote Traces

    • intel: Traces of Bluetooth sightings by groups of users carrying small devices (iMotes) for six days in Intel Research Cambridge Lab.
  • configuration:
    Location: Intel Research Cambridge Corporate Laboratory
    Date: January 2005
    Duration: Devices distributed on Thursday, January 6, at 11:30am. Devices collected on Tuesday, January 11, in the afternoon (most of the traces last only three days).
    Participants: 16 researchers, interns, and admin staff. One iMote was left in the kitchen, as a stationary node, during the experiment.
    Collected data: Data from only 9 iMotes could be collected properly. The others suffered from too many resets.
    Address IDs: ID 1 is the stationary node. IDs 2-9 correspond to mobile iMotes. IDs 10-128 correspond to external devices.
  • format:
    ===== "table.Exp1.dat" is a file describing the contacts in which each device was seen.
    Example taken from table.Exp1.dat (first columns and first rows):
        ID #  Class  Incidence  Occurrence :    Total  ID 1  ID 2
                                Contact Time :
        1     1      8                          143    0     32
                                                69951  0     4835
        2     1      8                          168    19    0
                                                68818  1260  0
    - The first column gives the ID of the device.
    - The second column takes the value 1 or 2 and indicates whether the device is (1) an internal device (one of the iMotes we distributed) or (2) an external device (identified by its MAC address). We usually give smaller IDs to internal nodes, which is why all tables start with devices of class 1.
    - The third column gives the incidence of this device, namely the number of iMotes that recorded its MAC address during the experiment. It is usually between 1 and n for an external device (where n is the number of iMotes deployed), and between 1 and n-1 for an internal device.
    - The rest of the table gives the number of contacts in which this device was seen (first line) and the cumulated time of these contacts (second line). Each column corresponds to the iMote that recorded the device.
    In the example above, the node with ID 1 was seen 143 times in total during Experiment 1, and 32 times by the node with ID 2. The cumulated time for which node 2 saw node 1 is 4835 s. Node 2 was seen 168 times in total, and 19 times by node 1; the cumulated time for which node 1 saw node 2 is 1260 s. Note that, as we usually observe, these numbers may not be symmetric, as interference and the limits of our implementation can create non-mutual sightings. They are, however, usually of the same order.
    ===== "MAC3Btable.Exp1.dat" is a file that contains the first three bytes of the MAC address associated with each ID. It can be useful for identifying the kind of each external device.
    ===== "contacts.Exp1.dat" is a file describing the contacts that were recorded by all the devices we distributed during this experiment.
    Example taken from contacts.Exp1.dat (first rows):
        1 8 121 121 1 0
        1 3 236 347 1 0
        1 4 236 347 1 0
        1 5 121 464 1 0
        1 8 585 585 2 464
    - The first column gives the ID of the device that recorded the sighting (ID1).
    - The second column gives the ID of the device that was seen (ID2); it may be an iMote or another device recorded during the experiment.
    - The third and fourth columns give, respectively, the first and last times at which the address of ID2 was recorded by ID1 for this contact.
    - The fifth and sixth columns are there for reading convenience. The fifth enumerates the contacts with the same ID1 and ID2, as 1, 2, and so on. The sixth gives the time difference between the beginning of this contact and the end of the previous contact with the same ID1 and ID2. By convention it is set to 0 if this is the first contact for this ID1 and ID2.
    - Note, again, that these contacts may not be mutual between a pair of iMotes, because the scanning periods of different iMotes are not synchronized, and because the sightings might not be symmetric.
  • description: This trace includes Bluetooth sightings by groups of users carrying small devices (iMotes) for six days in Intel Research Cambridge Corporate Laboratory.
  • last modified: 2006-11-14
  • dataname: cambridge/haggle/imote/intel
  • version: 20060131
  • change: the initial version
  • release date: 2006-01-31
  • date/time of measurement start: 2005-01-06
  • date/time of measurement end: 2005-01-11
  • url: /download/cambridge/haggle/imote-trace1.tar.gz
    • cambridge: Trace of Bluetooth sightings by groups of users carrying small devices (iMotes) for six days in the Computer Lab at the University of Cambridge.
  • configuration:
    Location: Computer Lab, University of Cambridge
    Date: End of January 2005
    Duration: Devices distributed on Tuesday, January 25th, 2005 at 14:00. Devices collected on Monday, January 31st, 2005 in the afternoon (most of the iMotes lasted around 5 days).
    Participants: 19 graduate students from the System Research Group.
    Collected data:
    - Some of the iMotes did not deliver any useful data, as a consequence of accidental hardware resets. Contacts with any of them were discarded from the traces of the other iMotes to avoid any consequence on the experimental results.
    - In total only 12 iMotes could be used to produce this trace; the others suffered from hardware resets. The contacts with these nodes were discarded from the complete
    - Details of ID numbers: IDs 1-12 correspond to iMotes (Class 1). IDs 13-223 correspond to external devices (Class 2).
  • format:
    ===== "table.Exp2.dat" is a file describing the contacts in which each device was seen.
    Example taken from table.Exp2.dat (first columns and first rows):
        ID #  Class  Incidence  Occurrence :    Total  ID 1  ID 2
                                Contact Time :
        1     1      8                          143    0     32
                                                69951  0     4835
        2     1      8                          168    19    0
                                                68818  1260  0
    - The first column gives the ID of the device.
    - The second column takes the value 1 or 2 and indicates whether the device is (1) an internal device (one of the iMotes we distributed) or (2) an external device (identified by its MAC address). We usually give smaller IDs to internal nodes, which is why all tables start with devices of class 1.
    - The third column gives the incidence of this device, namely the number of iMotes that recorded its MAC address during the experiment. It is usually between 1 and n for an external device (where n is the number of iMotes deployed), and between 1 and n-1 for an internal device.
    - The rest of the table gives the number of contacts in which this device was seen (first line) and the cumulated time of these contacts (second line). Each column corresponds to the iMote that recorded the device.
    In the example above, the node with ID 1 was seen 143 times in total during the experiment, and 32 times by the node with ID 2. The cumulated time for which node 2 saw node 1 is 4835 s. Node 2 was seen 168 times in total, and 19 times by node 1; the cumulated time for which node 1 saw node 2 is 1260 s. Note that, as we usually observe, these numbers may not be symmetric, as interference and the limits of our implementation can create non-mutual sightings. They are, however, usually of the same order.
    ===== "MAC3Btable.Exp2.dat" is a file that contains the first three bytes of the MAC address associated with each ID. It can be useful for identifying the kind of each external device.
    ===== "contacts.Exp2.dat" is a file describing the contacts that were recorded by all the devices we distributed during this experiment.
    Example taken from contacts.Exp2.dat (first rows):
        1 8 121 121 1 0
        1 3 236 347 1 0
        1 4 236 347 1 0
        1 5 121 464 1 0
        1 8 585 585 2 464
    - The first column gives the ID of the device that recorded the sighting (ID1).
    - The second column gives the ID of the device that was seen (ID2); it may be an iMote or another device recorded during the experiment.
    - The third and fourth columns give, respectively, the first and last times at which the address of ID2 was recorded by ID1 for this contact.
    - The fifth and sixth columns are there for reading convenience. The fifth enumerates the contacts with the same ID1 and ID2, as 1, 2, and so on. The sixth gives the time difference between the beginning of this contact and the end of the previous contact with the same ID1 and ID2. By convention it is set to 0 if this is the first contact for this ID1 and ID2.
    - Note, again, that these contacts may not be mutual between a pair of iMotes, because the scanning periods of different iMotes are not synchronized, and because the sightings might not be symmetric.
  • description: This trace includes Bluetooth sightings by groups of users carrying small devices (iMotes) for six days in Computer Lab at University of Cambridge.
  • last modified: 2006-11-14
  • dataname: cambridge/haggle/imote/cambridge
  • version: 20060131
  • change: the initial version
  • release date: 2006-01-31
  • date/time of measurement start: 2005-01-25
  • date/time of measurement end: 2005-01-31
  • url: /download/cambridge/haggle/imote-trace2.tar.gz
    • infocom: Trace of Bluetooth sightings by groups of users carrying small devices (iMotes) for four days.
  • configuration:
    Location: IEEE Infocom conference, Grand Hyatt Miami
    Date: March 2005
    Duration: Devices distributed on March 7th, 2005 between lunch time and 5pm. Devices collected on March 10th, 2005 in the afternoon.
    Participants: 50 students attending the student workshop.
    Collected data:
    - 2 iMotes were lost, and 7 did not deliver useful data, as a consequence of accidental hardware resets. Contacts with any of these were discarded from the traces of the other iMotes to avoid any consequence on the experimental results.
    - The first six hours were discarded, as people were attending the same workshop during the first afternoon.
    - Details of ID numbers: IDs 1-41 correspond to iMotes (Class 1). IDs 42-274 correspond to external devices (Class 2).
  • format:
    ===== "table.Exp3.dat" is a file describing the contacts in which each device was seen.
    Example taken from table.Exp3.dat (first columns and first rows):
        ID #  Class  Incidence  Occurrence :    Total  ID 1  ID 2
                                Contact Time :
        1     1      8                          143    0     32
                                                69951  0     4835
        2     1      8                          168    19    0
                                                68818  1260  0
    - The first column gives the ID of the device.
    - The second column takes the value 1 or 2 and indicates whether the device is (1) an internal device (one of the iMotes we distributed) or (2) an external device (identified by its MAC address). We usually give smaller IDs to internal nodes, which is why all tables start with devices of class 1.
    - The third column gives the incidence of this device, namely the number of iMotes that recorded its MAC address during the experiment. It is usually between 1 and n for an external device (where n is the number of iMotes deployed), and between 1 and n-1 for an internal device.
    - The rest of the table gives the number of contacts in which this device was seen (first line) and the cumulated time of these contacts (second line). Each column corresponds to the iMote that recorded the device.
    In the example above, the node with ID 1 was seen 143 times in total during the experiment, and 32 times by the node with ID 2. The cumulated time for which node 2 saw node 1 is 4835 s. Node 2 was seen 168 times in total, and 19 times by node 1; the cumulated time for which node 1 saw node 2 is 1260 s. Note that, as we usually observe, these numbers may not be symmetric, as interference and the limits of our implementation can create non-mutual sightings. They are, however, usually of the same order.
    ===== "MAC3Btable.Exp3.dat" is a file that contains the first three bytes of the MAC address associated with each ID. It can be useful for identifying the kind of each external device.
    ===== "contacts.Exp3.dat" is a file describing the contacts that were recorded by all the devices we distributed during this experiment.
    Example taken from contacts.Exp3.dat (first rows):
        1 8 121 121 1 0
        1 3 236 347 1 0
        1 4 236 347 1 0
        1 5 121 464 1 0
        1 8 585 585 2 464
    - The first column gives the ID of the device that recorded the sighting (ID1).
    - The second column gives the ID of the device that was seen (ID2); it may be an iMote or another device recorded during the experiment.
    - The third and fourth columns give, respectively, the first and last times at which the address of ID2 was recorded by ID1 for this contact.
    - The fifth and sixth columns are there for reading convenience. The fifth enumerates the contacts with the same ID1 and ID2, as 1, 2, and so on. The sixth gives the time difference between the beginning of this contact and the end of the previous contact with the same ID1 and ID2. By convention it is set to 0 if this is the first contact for this ID1 and ID2.
    - Note, again, that these contacts may not be mutual between a pair of iMotes, because the scanning periods of different iMotes are not synchronized, and because the sightings might not be symmetric.
  • description: This trace includes Bluetooth sightings by groups of users carrying small devices (iMotes) for four days in Conference IEEE Infocom in Grand Hyatt Miami.
  • last modified: 2006-11-14
  • dataname: cambridge/haggle/imote/infocom
  • version: 20060131
  • change: the initial version
  • release date: 2006-01-31
  • date/time of measurement start: 2005-03-07
  • date/time of measurement end: 2005-03-10
  • url: /download/cambridge/haggle/imote-trace3.tar.gz
  • hole: Of the fifty-four iMotes distributed, forty-one yielded useful data, eleven did not contain useful data because of various failures with the battery and packaging, and two were not returned.
  • limitation: Preliminary tests revealed the following problem: Bluetooth devices on a specific brand of mobile phone did not show up consistently during inquiries (and increasing the inquiry period to ten seconds did not help). Therefore, a small number of nodes were causing the memory to fill too quickly. To avoid this problem, we keep a device in the "in-contact list" even if it is not seen for one inquiry interval. If it comes back in-contact on the next interval, nothing is stored. If it does not, a record is stored as normal. This solves the problem, at the expense of not being able to detect actual cases where a node moved out of range during one two-minute period, and back into range for the next two-minute period.
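
As a reading aid for the six-column contacts files described in the format fields above, the following Python sketch parses contact rows and recomputes the fifth and sixth columns from the first four. It is a hedged illustration only: the names `Contact`, `parse_contacts` and `recompute_seq_and_gap` are hypothetical, and the sketch assumes whitespace-separated integer fields and that the rows of a given (ID1, ID2) pair appear in chronological order.

```python
from collections import defaultdict
from typing import NamedTuple


class Contact(NamedTuple):
    recorder: int  # column 1: ID of the device that recorded the sighting
    seen: int      # column 2: ID of the device that was seen
    start: int     # column 3: first time the address was recorded (s)
    end: int       # column 4: last time the address was recorded (s)
    seq: int       # column 5: index of this contact for the (ID1, ID2) pair
    gap: int       # column 6: time since the end of the previous contact


def parse_contacts(lines):
    """Parse contact rows; lines without exactly six fields are skipped."""
    contacts = []
    for line in lines:
        fields = line.split()
        if len(fields) == 6:
            contacts.append(Contact(*map(int, fields)))
    return contacts


def recompute_seq_and_gap(contacts):
    """Recompute columns 5 and 6 from columns 1 to 4 as (seq, gap) pairs."""
    last_end = {}
    count = defaultdict(int)
    result = []
    for c in contacts:
        key = (c.recorder, c.seen)
        count[key] += 1
        # Gap is 0 by convention for the first contact of a pair.
        gap = c.start - last_end[key] if key in last_end else 0
        last_end[key] = c.end
        result.append((count[key], gap))
    return result


# The example rows quoted in the format description.
sample = ["1 8 121 121 1 0", "1 3 236 347 1 0", "1 4 236 347 1 0",
          "1 5 121 464 1 0", "1 8 585 585 2 464"]
parsed = parse_contacts(sample)
recomputed = recompute_seq_and_gap(parsed)
```

On the example rows, the recomputed values agree with the stored columns: the second contact of the pair (1, 8) starts at 585 s, 464 s after the previous one ended at 121 s.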
Instructions: 

The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort. 

About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing. 

CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022. 

Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques. 

 

Please acknowledge the source of the Data in any publications or presentations reporting use of this Data. 

Citation:

James Scott, Richard Gass, Jon Crowcroft, Pan Hui, Christophe Diot, Augustin Chaintreau, cambridge/haggle, https://doi.org/10.15783/C70011 , Date: 20090529

 

Dataset Files

These datasets are part of Community Resource for Archiving Wireless Data (CRAWDAD). CRAWDAD began in 2004 at Dartmouth College as a place to share wireless network data with the research community. Its purpose was to enable access to data from real networks and real mobile users at a time when collecting such data was challenging and expensive. The archive has continued to grow since its inception, and starting in summer 2022 is being housed on IEEE DataPort.
