CRAWDAD strath/nodobo

Citation Author(s): Alisdair McDiarmid, James Irvine (University of Strathclyde), Stephen Bell, Jamie Banford (University of Strathclyde)
Submitted by: CRAWDAD Team
Last updated: Tue, 07/05/2011 - 08:00
DOI: 10.15783/C7HP4Q

Abstract 

Dataset of mobile phone usage records collected with Nodobo suite at the University of Strathclyde.

The dataset was gathered by Nodobo, a suite of social sensor software for Android phones, during a study of mobile phone usage at the University of Strathclyde.

date/time of measurement start: 2010-09-09 

date/time of measurement end: 2011-02-23 

collection environment: Our researchers developed "Nodobo", a set of software extensions to the Google Android operating system that enables the capture and replay of smartphone user interaction sessions. The software captures a variety of social context data, including logs of phone calls, text messages, Bluetooth proximity detections, WiFi access points, and cell tower IDs. The direction of each call and text message is recorded, along with the associated phone number and the duration of the call or length of the message. Bluetooth proximity is sampled every minute and includes all devices in the study as well as any other clients that respond to service discovery. Basic positioning is achieved through WiFi hotspot and cell tower ID records.

network configuration: Each of the study participants was given a Google Nexus One smartphone, prepared with a modified Android operating system. Data is stored in a simple database on the device SD card, which is then synchronised over the air to a central server. 

data collection methodology: The dataset was collected by monitoring the devices of 27 users over a 5-month study.

sanitization: Record fields containing personally identifiable information have been anonymised.  

Nodobo-2011-01-v1 is the traceset gathered by Nodobo software at University of Strathclyde from September 2010 to February 2011.

Traceset

strath/nodobo/mobile

Traceset of mobile phone usage records collected with Nodobo suite at the University of Strathclyde.

  • file: nodobo-release.tar.gz, nodobo-csv.zip
  • description: Nodobo-2011-01-v1 is the traceset gathered by Nodobo software at University of Strathclyde from September 2010 to February 2011.
  • measurement purpose: Usage Characterization, Social Network Analysis
  • methodology: A group of 27 promising students at a Scottish state high school was selected for this study. All of the students already owned a mobile phone, and approximately 1/3 of those phones were smartphones (iPhone, Blackberry, or a similarly powerful handset). Each of the study participants was given a Google Nexus One smartphone, prepared with a modified Android operating system.

    The close proximity of the deployment to University of Strathclyde allows the study organisers to schedule regular visits to diagnose issues and to make regular backups. To keep the dataset as up to date as possible, and to limit the number of visits required, the devices also synchronise with a web server over the mobile network or WiFi.

  • sanitization: Record fields containing personally identifiable information have been anonymised.  

strath/nodobo/mobile Trace

  • social: Mobile phone usage records collected with Nodobo suite at University of Strathclyde in 2010-2011.
    • configuration: 27 Google Nexus One smartphones were prepared with a modified Android operating system, running Nodobo. The phone database is synchronised periodically over-the-air with a web services data store. 
    • format:  db.sqlite3.dump.bz2 is a bzipped SQL dump of the sqlite3 database. You can recreate the database by doing the following: 

bzcat db.sqlite3.dump.bz2 | sqlite3 db.sqlite3 

# Database schema

The following tables are used:

## Calls and Messages

* other_id: id of the other user on the call (NULL if not in the study)

* number: phone number of the other end of the call/message (related: Users#number)

* duration: length of the call in seconds

* length: number of characters in the message

## CellTowers

* cellid: GSM base transceiver station CID

* lac: location area code

## Devices

* imei: blank for this release of the data

* mac: Bluetooth MAC (related: Presences#mac)

## Presences

* other_id: user_id of the detected device (NULL if not in the study)

* mac: Bluetooth MAC (related: Devices#mac)

* bluetooth_class: reported class of the device

* name: human-readable name of the device
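
As an illustration of how the Presences columns might feed a social network analysis, the minimal sketch below counts Bluetooth sightings between pairs of study participants. It runs on hand-made sample rows rather than the real database. The `proximity_counts` helper and the sample values are hypothetical names and data introduced here; only the `user_id` and `other_id` keys mirror columns described in this README.

```ruby
# Count sightings per unordered pair of study participants.
# Each row is a plain Hash standing in for one Presences record
# (an assumption for illustration; real rows come from the database).
# Rows whose other_id is nil (device not in the study) are skipped.
def proximity_counts(presences)
  counts = Hash.new(0)
  presences.each do |row|
    next if row[:other_id].nil?
    # Sort the ids so that (1, 2) and (2, 1) land on the same pair.
    pair = [row[:user_id], row[:other_id]].sort
    counts[pair] += 1
  end
  counts
end

# Made-up sample rows, not taken from the dataset.
sample = [
  { :user_id => 1, :other_id => 2 },
  { :user_id => 2, :other_id => 1 },
  { :user_id => 1, :other_id => nil },
  { :user_id => 3, :other_id => 1 }
]

proximity_counts(sample).each do |(a, b), n|
  puts "users #{a} and #{b}: #{n} sightings"
end
```

Treating a sighting as symmetric is a design choice of this sketch. If the direction of detection matters for a study, drop the `sort` so that each (detector, detected) pair is counted separately.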

## Users

* name: "Anonymous" for this release of the data

* number: phone number of the study user (related: Calls#number, Messages#number)

## Wifis

* ssid: human-readable name of the base station

* bssid: base station MAC

## All tables

* The database schema follows ActiveRecord conventions: tables are plurals, foreign keys are singular_id, each table has an id primary key and created_at/updated_at timestamps.

* user_id is used to indicate which user recorded the interaction.

* The calls and messages tables have two timestamp columns. The call_timestamp/message_timestamp is the one recorded by the phone when the call/message was originally recorded. The timestamp column is the time at which the calldb/smsdb synchronisation occurred (which is less useful).

* Some tables have an "interaction" column. This was used for database synchronising and is left in for internal debugging purposes.

# Software and studies

Also included in the dataset download are programs for three sample studies. These are detailed below.

Each program can be run with ruby: for example, "ruby conversation-length.rb". The programs assume that your current working directory is the one with the database and the nodobo.rb code.

Software used:

* Ruby 1.8.7 or later, with gems: activerecord, sqlite3-ruby, progressbar

* gnuplot 4.4

* GraphViz 2.22

## Ruby interface: nodobo.rb

We have supplied a simple ActiveRecord interface to the database, "nodobo.rb". This gives classes and relations for each of the types of data in the dataset.

The interface can be used by running "irb -r ./nodobo.rb", or by using "require 'nodobo'" in your own programs. A sample irb session is given below:

>> u = User.find(19)

=> #

>> u.calls.size

=> 976

>> study_calls = u.calls.select {|c| c.other != nil }; study_calls.size

=> 133

>> Hash[study_calls.group_by(&:other_id).map {|k,v| [k, v.size]}]

=> {16=>2, 19=>1, 25=>2, 14=>4, 21=>124}

>> v = User.find(21)

=> #

>> v.calls.select {|c| c.other != nil }.size

=> 175

    • sanitization: The following fields have been altered to remove personal information from the dataset:

  * Call#number, Message#number, User#number                                  

  * Device#mac, Presence#mac                                                 

  * Wifi#bssid                                                                 

  * Presence#name                                                              

  * Wifi#ssid                                                                  

  * CellTower#cellid                                                           

  * CellTower#lac                                                                                                         

Each real value for these fields maps 1:1 to a randomly-generated anonymous value. The process for generating these values is as follows:

  * Phone number: random number with the same number of digits; if original number is 3 or more digits, keep the original first 2 digits

  * MAC address: 12 random hex digits                                             

  * Bluetooth name/Wifi ssid: random sequence of dictionary words, same number of words as original name

  * Cell ID and LAC: random number with the same number of digits 
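
The substitution rules above can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the code used to prepare the release: the helper names (`random_phone_number`, `random_mac`, `random_digits`, `pseudonym_for`) are hypothetical, the input values are assumed to be strings of digits, and the sample phone number is made up. The dictionary-word rule for Bluetooth names and SSIDs is left out because it needs a word list.

```ruby
require 'securerandom'

# Phone number: random number with the same number of digits.
# Numbers with 3 or more digits keep their original first 2 digits.
def random_phone_number(original)
  keep = original.length >= 3 ? 2 : 0
  tail = Array.new(original.length - keep) { SecureRandom.random_number(10) }
  original[0, keep] + tail.join
end

# MAC address: 12 random hex digits.
def random_mac
  SecureRandom.hex(6)
end

# Cell ID and LAC: random number with the same number of digits.
def random_digits(original)
  Array.new(original.length) { SecureRandom.random_number(10) }.join
end

# Keep the mapping 1:1: a real value always gets the same pseudonym,
# and a pseudonym is never handed out to two different real values.
# Assumption of this sketch: the random space is much larger than the
# number of real values, otherwise the retry loop may not terminate.
def pseudonym_for(table, real)
  return table[real] if table.key?(real)
  candidate = yield(real)
  candidate = yield(real) while table.value?(candidate)
  table[real] = candidate
end

phones = {}
a = pseudonym_for(phones, "07123456789") { |n| random_phone_number(n) }
b = pseudonym_for(phones, "07123456789") { |n| random_phone_number(n) }
puts a
puts a == b
```

Keeping one lookup table per kind of field (phone numbers, MAC addresses, cell IDs) is what lets related columns such as Calls#number and Users#number still be joined after anonymisation.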

Instructions: 

The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort. 

About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing. 

CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022. 

Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques. 

Please acknowledge the source of the Data in any publications or presentations reporting use of this Data. 

Citation:

Alisdair McDiarmid, James Irvine, Stephen Bell, Jamie Banford, strath/nodobo, https://doi.org/10.15783/C7HP4Q , Date: 20110323

Copyright (c) 2011 University of Strathclyde

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Documentation: strath-nodobo-readme.txt (1.6 KB)
