CRAWDAD tools/analyze/802.11/Wit (v. 2006-09-29)

Citation Author(s): Ratul Mahajan, Maya Rodrig, John Zahorjan
Submitted by: CRAWDAD Team
Last updated: Thu, 11/09/2006 - 08:00
DOI: 10.15783/C7DG6G

Abstract 

A tool to analyze wireless MAC.

Wit is a non-intrusive tool that builds on passive monitoring to analyze the detailed MAC-level behavior of operational wireless networks.

Lastmodified :

2006-11-09

Dataname :

tools/analyze/802.11/Wit

File :

wit-v20060929.tar.gz, README-v20060929

Releasedate :

2006-09-29

References :

mahajan-wit
README

Website :

http://www.cs.washington.edu/research/networking/wireless/index.html

Keyword :

RFMON
802.11 frames
802.11
packet trace
tcpdump

License :

It can be used and distributed under the following terms:
- the software is provided as is, with no warranties or liabilities
- research and non-commercial use is permitted
- contact the authors for commercial use
- the original attribution should exist on any derivative work
- the derivative works should be distributed under similar terms

Support :

these scripts are provided as is. they may cause irreversible damage
to your psyche; use them at your own risk.

they were written hastily and not with a view to sharing with others.
as a result, they are ugly, sloppy, and have many idiosyncrasies. we
have tried to flag these in this README file but you are bound to run
into issues.

if problems arise, first please carefully look at the source and try
to understand what is going on. contact us if things are still
unclear.

we are very interested in hearing about your experience with them.
bugfixes and suggestions for improvement are also appreciated.

Build :

the included scripts use perl, the DBI module of perl and mysql (other
databases supported by DBI should also work).  install them first. you
should be at least a little familiar with sql.

Output :

See "usage" for details about the output of each tool.

Parameters :

See "usage" for details about the parameters needed for each tool.

Usage :

A. inserting traces into the database
-------------------------------------

([NOTE] only the default behavior of the scripts is explained below.
their behavior can be modified using command line arguments.  look at
the scripts for the various options. using the -h flag or incorrect
usage will print out a usage message for most scripts.)


1. create the database where you want to store the tables. you can do
this with mysqladmin as:
% mysqladmin create [dbname]
where [dbname] is the name of the database.

2. set up your environment to use the database:
export WDB_USER=[username];    # the username to use for accessing the DB
export WDB_PASSWD=[password];  # the password
export WDB_HOST=[hostname];    # the host where mysql is installed
export WDB_DB=[dbname]         # the database name

to avoid setting up this environment each time, insert these into your
shell's .rc file (e.g., .bashrc), or modify localutils.pl which
contains the defaults to use when the above environment variables are
not defined.

3. run createIndexTablesTable.pl to create some helper index tables in
the database.

4. populate the database with the monitor traces. the format of these
tables is:


mysql> show columns from raw_east_i0_c3;
+------------------+----------------------+------+-----+---------+-------+
| Field            | Type                 | Null | Key | Default | Extra |
+------------------+----------------------+------+-----+---------+-------+
| id               | int(11)              | NO   | PRI |         |       |
| aFrameTime       | decimal(18,7)        | YES  |     | NULL    |       |
| aTimeDelta       | int(10) unsigned     | YES  |     | NULL    |       |
| bChannel         | tinyint(3) unsigned  | YES  |     | NULL    |       |
| bHostTime        | int(10) unsigned     | YES  |     | NULL    |       |
| bMACTime         | int(10) unsigned     | YES  |     | NULL    |       |
| bRate            | float                | YES  |     | NULL    |       |
| bRssi            | smallint(6)          | YES  |     | NULL    |       |
| bSize            | smallint(5) unsigned | YES  |     | NULL    |       |
| cA1_Dest         | smallint(5) unsigned | YES  |     | NULL    |       |
| cA2_Src          | smallint(5) unsigned | YES  |     | NULL    |       |
| cA3              | smallint(5) unsigned | YES  |     | NULL    |       |
| cA4              | smallint(5) unsigned | YES  |     | NULL    |       |
| cDSFrom          | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cDSTo            | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cDur             | smallint(5) unsigned | YES  |     | NULL    |       |
| cFragno          | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cMoredata        | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cMorefrag        | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cOrdered         | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cProtver         | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cPwrmgt          | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cRetry           | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cSeqno           | smallint(5) unsigned | YES  |     | NULL    |       |
| cSubtype         | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cType            | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cWEP             | tinyint(3) unsigned  | YES  |     | NULL    |       |
| dBeaconTimestamp | bigint(20) unsigned  | YES  |     | NULL    |       |
| zErr             | tinyint(3) unsigned  | YES  |     | NULL    |       |
| zErrDriver       | tinyint(3) unsigned  | YES  |     | NULL    |       |
| zErrPhy          | tinyint(3) unsigned  | YES  |     | NULL    |       |
+------------------+----------------------+------+-----+---------+-------+
31 rows in set (0.03 sec)


the meanings of most fields should be apparent (ignore the first
letter of the fieldnames) except for maybe the following:
- zErr* fields correspond to frames that were not correctly
decoded
- aFrameTime is the absolute time of the frame in seconds. this is
persistent but may not be very precise.
- aTimeDelta and bHostTime: ignore these fields
- bMACTime is the hardware timestamp of the frame in microseconds.
it is 32 bits wide and hence rolls over in roughly 1.2 hours.

not all fields are meaningful for all captured frames.
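
as a rough check of the rollover figure for bMACTime, and to show one way of working around it, the sketch below unwraps the 32-bit values into a monotonic series. it is not part of Wit: the function names are made up, and it assumes that frames are in capture order and that the gap between consecutive frames is well below one rollover period. frames flagged by the zErr* fields may carry unreliable timestamps and would need to be filtered first.

```python
ROLLOVER = 2 ** 32  # bMACTime is a 32-bit counter of microseconds


def rollover_hours():
    """Return the rollover period of a 32-bit microsecond counter, in hours."""
    return ROLLOVER / 1e6 / 3600


def unwrap_mactime(values):
    """Turn wrapped 32-bit timestamps into a monotonic series (microseconds).

    Assumption: values are in capture order and no gap between consecutive
    frames reaches a full rollover period.
    """
    unwrapped = []
    offset = 0
    previous = None
    for value in values:
        # A smaller value than the previous one means the counter wrapped.
        if previous is not None and value < previous:
            offset += ROLLOVER
        unwrapped.append(value + offset)
        previous = value
    return unwrapped
```

the rollover period works out to 2^32 microseconds, i.e. about 4295 seconds or a little under 1.2 hours.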

to save disk space (and to speed processing), the MAC address fields
are stored as indices into a corresponding MACIndex table. the name of
the mac index table should be [raw_table_name]_MACIndex. its format
is:

mysql> show columns from raw_east_i0_c3_MACIndex;
+-------+------------------+------+-----+---------+----------------+
| Field | Type             | Null | Key | Default | Extra          |
+-------+------------------+------+-----+---------+----------------+
| id    | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| value | char(18)         | YES  |     | NULL    |                |
+-------+------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
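
to make the indexing concrete, here is a small sketch of resolving the cA2_Src index back to a MAC address by joining a raw table with its MACIndex table. it uses python's built-in sqlite3 as a stand-in for mysql, and the table name raw_demo, the MAC addresses, and the helper resolve_sources are all made up for illustration. it also assumes that the address columns hold the id values of the MACIndex table.

```python
import sqlite3


def resolve_sources(conn, raw_table):
    """Return (frame id, source MAC string) pairs for a raw table.

    Table names cannot be bound as query parameters, so raw_table is
    assumed to be a trusted name here.
    """
    query = (
        f"SELECT r.id, m.value FROM {raw_table} AS r "
        f"LEFT JOIN {raw_table}_MACIndex AS m ON r.cA2_Src = m.id "
        "ORDER BY r.id"
    )
    return conn.execute(query).fetchall()


# Hypothetical miniature tables with only the columns needed for the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_demo (id INTEGER PRIMARY KEY, cA2_Src INTEGER);
CREATE TABLE raw_demo_MACIndex (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    value CHAR(18)
);
INSERT INTO raw_demo_MACIndex (value)
    VALUES ('00:11:22:33:44:55'), ('66:77:88:99:aa:bb');
INSERT INTO raw_demo (id, cA2_Src) VALUES (1, 2), (2, 1), (3, NULL);
""")

rows = resolve_sources(conn, "raw_demo")
```

the LEFT JOIN keeps frames whose source index is NULL, which matters because not every frame carries every address field.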


we have included a script called createRawDataDatabase.pl that creates
such tables from tcpdump, prism header logs such as those collected by
us at sigcomm04. this script is run as:
% createRawDataDatabase.pl -d [dir] [logger] [interface] [channel]
where:
[dir] is the directory containing the traces

the script assumes that the trace filenames are of the format:
[dir]/[logger]/*-ath[interface].tcpdump*;
if your filenames are in a different format, modify the following line in the script:
my @fileList = split(' ', `ls $rootDir/$logger/*-ath$interface.tcpdump*`);

the script uses tethereal to read the traces; you need to point it at
the location of tethereal by modifying the following line in the script:
my $TETHEREAL = "$ENV{HOME}/usr/local/bin/tethereal";

to insert traces of a different format into the database, you need to
change the processPackets subroutine in createRawDataDatabase.pl.
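
before running createRawDataDatabase.pl, it can help to check which of your files the default filename pattern described above would pick up. the sketch below builds the same pattern and tests names against it. the helper names and the example filenames are made up, and fnmatch only approximates the shell's ls globbing (its * also matches "/").

```python
from fnmatch import fnmatchcase


def trace_pattern(root_dir, logger, interface):
    """Build the default pattern [dir]/[logger]/*-ath[interface].tcpdump*."""
    return f"{root_dir}/{logger}/*-ath{interface}.tcpdump*"


def matching_traces(filenames, root_dir, logger, interface):
    """Return the names from filenames that the default pattern would match."""
    pattern = trace_pattern(root_dir, logger, interface)
    return [name for name in filenames if fnmatchcase(name, pattern)]
```

for example, with logger east and interface 0, a file named traces/east/day1-ath0.tcpdump.gz matches, while traces/east/day1-ath1.tcpdump does not.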

Usage :

B. merging (halfWit)
--------------------

merging happens in multiple, waterfall rounds. two tables are merged
in each round. the first table can be a result of merging in the previous
rounds; the second table is a raw table. a round consists of two
steps:
a. createBeaconIndex.pl creates a temporary table that contains all
the beacons that are common to the two tables. be sure to ignore
beacons from APs that do not have monotonically increasing beacon
timestamps.

b. createMergedTable.pl uses the beacons table to merge the two
tables.
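
one way to find the APs whose beacons should be ignored in step a is sketched below. this is not how createBeaconIndex.pl does it; the function name and the input format, a list of (AP, beacon timestamp) pairs in capture order, are assumptions. it also assumes that "monotonically increasing" means strictly increasing.

```python
def non_monotonic_aps(beacons):
    """Return the set of APs whose beacon timestamps ever fail to increase.

    beacons: iterable of (ap, beacon_timestamp) pairs in capture order.
    """
    last_seen = {}
    bad = set()
    for ap, timestamp in beacons:
        # Flag the AP if this timestamp does not exceed its previous one.
        if ap in last_seen and timestamp <= last_seen[ap]:
            bad.add(ap)
        last_seen[ap] = timestamp
    return bad
```

an AP that reboots during the trace resets its beacon timestamp and would be flagged by this check.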

an example of the usage of these two scripts is shown in
merge-commands.sh. that shell script shows how to merge the following
five tables:
raw_chi_i0_c1
raw_sah_i0_c1
raw_son_i0_c1
raw_kal_i0_c1
raw_moj_i0_c1

createBeaconIndex.pl takes as input the names of the two tables being
merged and the desired name for the beacons table, which should be of
the form beacons_*.

createMergedTable.pl takes as input the name of the beacons
table. from beacons_*, it produces the merged table called merged_*.
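
the waterfall order can be written down as a small plan: each round merges the result so far with the next raw table. the sketch below only computes table names and does not call the scripts (see merge-commands.sh for the actual commands). the function name is made up, and the beacons_/merged_ naming with concatenated monitor names is an assumption based on the table names in the toy database.

```python
def plan_merge_rounds(raw_tables):
    """Return one (first, second, beacons_table, merged_table) tuple per round.

    Assumes names of the form raw_[monitor]_[suffix], e.g. raw_chi_i0_c1,
    with the same suffix for every table.
    """
    monitors = []
    suffixes = set()
    for name in raw_tables:
        prefix, monitor, suffix = name.split("_", 2)
        if prefix != "raw":
            raise ValueError(f"not a raw table name: {name}")
        monitors.append(monitor)
        suffixes.add(suffix)
    if len(suffixes) > 1:
        raise ValueError("tables differ in interface/channel suffix")

    rounds = []
    first = raw_tables[0] if raw_tables else None
    for i in range(1, len(raw_tables)):
        label = "_".join(monitors[: i + 1]) + "_" + suffix
        merged = f"merged_{label}"
        rounds.append((first, raw_tables[i], f"beacons_{label}", merged))
        first = merged  # the next round merges into this result
    return rounds
```

for five raw tables this yields four rounds, and the last merged table carries all five monitor names.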

for us, merging the traces of kal(ahari) was a little bit more
problematic because that monitor was being shut down at nights.
merge-commands.sh also shows how such monitors can be merged.

the merged tables contain more columns than the raw tables. the number
of additional columns depends on how many raw tables have been merged
so far. a merge of three raw tables will have the
following additional columns:

| zzFromTable      | tinyint(3) unsigned  | YES  |     | NULL    |                |
| zzid_0           | int(11)              | YES  |     | NULL    |                |
| zzbRssi_0        | smallint(6)          | YES  |     | NULL    |                |
| zzTDiff_0        | tinyint(3) unsigned  | YES  |     | NULL    |                |
| zzid_1           | int(11)              | YES  |     | NULL    |                |
| zzbRssi_1        | smallint(6)          | YES  |     | NULL    |                |
| zzTDiff_1        | tinyint(3) unsigned  | YES  |     | NULL    |                |
| zzid_2           | int(11)              | YES  |     | NULL    |                |
| zzbRssi_2        | smallint(6)          | YES  |     | NULL    |                |
| zzTDiff_2        | tinyint(3) unsigned  | YES  |     | NULL    |                |
+------------------+----------------------+------+-----+---------+----------------+
- zzFromTable: the input table from which this frame came (any
one if it was present in multiple tables).

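- zzid_[k]: the corresponding id of this frame in the k-th raw
table. it is null if this frame was not present in the k-th raw table.

- zzbRssi_[k]: similarly, the RSSI of this frame in the k-th raw
table. it is null if this frame was not present in the k-th raw table.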

- zzTDiff_[k]: the time difference in translated terms between the
timestamp in the k-th raw table and the timestamp assigned to the
merged frame. (this can be used to gauge the quality of time
synchronization.)

Usage :

C. inference (nitWit)
--------------------

lostPackets.pl implements the functionality of nitWit. it can be run
simply as:
% ./lostPackets.pl [tablename]
where [tablename] can either be a raw or a merged table.

this script mainly produces three tables -- processed, extras, and
synthpkts.  apart from some columns that it shares with the raw and
merged tables, the processed table contains the following columns:

| _zBSSID    | smallint(5) unsigned | YES  |     | NULL    |       |
| _zDir      | tinyint(3) unsigned  | YES  |     | NULL    |       |
| _zWasrcvd  | tinyint(3) unsigned  | YES  |     | NULL    |       |
+------------+----------------------+------+-----+---------+-------+

- _zBSSID is the inferred BSSID of the packet. not all 802.11
packets contain the BSSID field in their headers. inserting this here
simplifies per packet processing. similarly, the cA2_Src is filled out
properly with the inferred source of the packet if it is absent (e.g.,
for ACKs).

- _zDir represents the direction of the packet. the meanings of the
values are contained in the dirIndex table created by
createIndexTablesTable.pl.

mysql> select * from dirIndex;
+----+--------+
| id | value  |
+----+--------+
|  0 | up     |
|  1 | down   |
|  2 | ad hoc |
|  3 | ap-ap  |
+----+--------+
4 rows in set (0.02 sec)

- _zWasrcvd represents whether the frame was received by its
destination.
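
for frames that do carry the DS bits, the standard 802.11 meaning of cDSTo and cDSFrom lines up with the dirIndex values shown above. the sketch below encodes that mapping. it is only an illustration: the function name is made up, and lostPackets.pl may infer _zDir differently, in particular for frames such as ACKs where the bits are not informative.

```python
# Values as listed in the dirIndex table.
DIR_INDEX = {0: "up", 1: "down", 2: "ad hoc", 3: "ap-ap"}


def direction_from_ds_bits(to_ds, from_ds):
    """Map the 802.11 To DS / From DS bits to a dirIndex id."""
    if to_ds and from_ds:
        return 3  # both set: AP-to-AP (wireless distribution system)
    if to_ds:
        return 0  # client to AP
    if from_ds:
        return 1  # AP to client
    return 2      # neither set: ad hoc (also management/control frames)
```

so a data frame with cDSTo = 1 and cDSFrom = 0 would map to "up".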

the extras table contains some auxiliary information for the frames.
it is separate from the processed table to reduce the processing time
of the scripts that operate on the processed table, which is
restricted to commonly used fields.

the synthpkts table contains the inferred packets.

for an input table called raw_[details] or merged_[details], by
default, the three tables will be called processed_[details],
processed_[details]_extras, processed_[details]_SynthPkts.
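
the default naming rule above can be spelled out as a small helper, which is handy when scripting queries against the output of lostPackets.pl. the function name is made up, and the sketch covers only the default names, not names changed through command line options.

```python
def nitwit_output_tables(input_table):
    """Return the default (processed, extras, synthpkts) table names.

    input_table must be named raw_[details] or merged_[details].
    """
    for prefix in ("raw_", "merged_"):
        if input_table.startswith(prefix):
            details = input_table[len(prefix):]
            base = f"processed_{details}"
            return base, f"{base}_extras", f"{base}_SynthPkts"
    raise ValueError(f"expected a raw_ or merged_ table, got {input_table}")
```

for merged_mon1_mon2_mon3_i0_c0 this gives processed_mon1_mon2_mon3_i0_c0 and its _extras and _SynthPkts companions.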


lostPackets.pl takes a bunch of options. (for details, see the script
or use the -h flag.) a couple of important ones are:
-e regexpFile : the file containing the regular expression
-w wtfile : the file containing symbol weights

the script also uses the perl Parse::RecDescent package. it is
included for convenience.

Usage :

D. contenders (dimWit)
---------------------

computeContention2.pl estimates the number of contenders at the
instant when each packet was sent. run it simply as:
% computeContention2.pl [tablename]
it takes a processed table (see above) as input.

to simplify the computation of aggregate statistics as a function of
#contenders, it summarizes its results in two main tables. given
processed_[details], the two tables will be called
contention_[details]_Time and contention_[details]_Packet.
- the _Time table summarizes how a client spends its time -- whether
it was idle, deferring, busy, etc. -- at each contention level.
- the _Packet table summarizes the details of clients' packets --
their rate, direction, reception status, etc. -- at each contention
level.

look at the columns of the two tables (and the source) to get a better
sense of what's in them.

Example :

we have provided a toy database as an example to help you get started.
the toy db contains 3 raw tables from our simulator experiments. the
script wit_toydb.sh contains commands to run halfwit, nitwit, and
dimwit over them.

1. assuming that you have already installed mysql, set the environment
to use it:
% export WDB_HOST=[host]
% export WDB_USER=[username]
% export WDB_PASSWD=[password]
(the script below will automatically set the WDB_DB variable)

2. simply run the provided shell script:
% ./wit_toydb.sh

this will generate a bunch of output and warnings. if everything is
working fine, your output should look like that in wit_toydb.log. (you
could pipe the output of wit_toydb.sh to a file and compare it with
the provided wit_toydb.log.)

3. by now, you should have a few logs in your working directory and
more tables in your database. to look at the tables:
% mysql -h [host] -u [username] -p[password] wit_toydb
this should bring up the mysql shell, at which point, do:
mysql> show tables;
the output of this command should display:

+------------------------------------------+
| Tables_in_wit_toydb                      |
+------------------------------------------+
| _zParseSymbolIndex                       |
| beacons_mon1_mon2_i0_c0                  |
| beacons_mon1_mon2_i0_c0_TblIndex         |
| beacons_mon1_mon2_mon3_i0_c0             |
| beacons_mon1_mon2_mon3_i0_c0_TblIndex    |
| contention_mon1_mon2_mon3_i0_c0_Client   |
| contention_mon1_mon2_mon3_i0_c0_Interval |
| contention_mon1_mon2_mon3_i0_c0_Packet   |
| contention_mon1_mon2_mon3_i0_c0_Time     |
| dirIndex                                 |
| indexIndex                               |
| merged_mon1_mon2_i0_c0                   |
| merged_mon1_mon2_i0_c0_MACIndex          |
| merged_mon1_mon2_i0_c0_MTblIndex         |
| merged_mon1_mon2_mon3_i0_c0              |
| merged_mon1_mon2_mon3_i0_c0_MACIndex     |
| merged_mon1_mon2_mon3_i0_c0_MTblIndex    |
| processed_mon1_mon2_mon3_i0_c0           |
| processed_mon1_mon2_mon3_i0_c0_MACIndex  |
| processed_mon1_mon2_mon3_i0_c0_PTblIndex |
| processed_mon1_mon2_mon3_i0_c0_SymWts    |
| processed_mon1_mon2_mon3_i0_c0_SynthPkts |
| processed_mon1_mon2_mon3_i0_c0_extras    |
| raw_mon1_i0_c0                           |
| raw_mon1_i0_c0_MACIndex                  |
| raw_mon2_i0_c0                           |
| raw_mon2_i0_c0_MACIndex                  |
| raw_mon3_i0_c0                           |
| raw_mon3_i0_c0_MACIndex                  |
| subtypeIndex                             |
| typeIndex                                |
+------------------------------------------+
31 rows in set (0.00 sec)

Algorithm :

Wit uses three processing steps to construct an enhanced trace of system activity.
First, a robust merging procedure combines the necessarily incomplete views
from multiple, independent monitors into a single, more complete trace of wireless
activity. Next, a novel inference engine based on formal language methods
reconstructs packets that were not captured by any monitor and determines
whether each packet was received by its destination.
Finally, Wit derives network performance measures from this
enhanced trace; we show how to estimate the number of stations
competing for the medium.
Instructions: 

The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort. 

About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing. 

CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022. 

Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques. 

Please acknowledge the source of the Data in any publications or presentations reporting use of this Data. 

Citation: Ratul Mahajan, Maya Rodrig, John Zahorjan, CRAWDAD toolset tools/analyze/802.11/Wit (v. 2006-09-29), https://doi.org/10.15783/C7DG6G, Sep 2006.
