CRAWDAD tools/analyze/802.11/Wit (v. 2006-09-29)
- Submitted by: CRAWDAD Team
- Last updated: Thu, 11/09/2006 - 08:00
- DOI: 10.15783/C7DG6G
Abstract
A tool to analyze wireless MAC.
Wit is a non-intrusive tool that builds on passive monitoring to analyze the detailed MAC-level behavior of operational wireless networks.
Lastmodified :
2006-11-09
Dataname :
tools/analyze/802.11/Wit
File :
wit-v20060929.tar.gz, README-v20060929
Releasedate :
2006-09-29
References :
mahajan-wit README
Website :
http://www.cs.washington.edu/research/networking/wireless/index.html
Keyword :
RFMON 802.11 frames 802.11 packet trace tcpdump
License :
It can be used and distributed under the following terms:
- the software is provided as is, with no warranties or liabilities
- research and non-commercial use is permitted
- contact the authors for commercial use
- the original attribution should exist on any derivative work
- the derivative works should be distributed under similar terms
Support :
these scripts are provided as is. they may cause irreversible damage to your psyche; use them at your own risk. they were written hastily and not with a view to sharing with others. as a result, they are ugly, sloppy, and have many idiosyncrasies. we have tried to flag these in this README file but you are bound to run into issues. if problems arise, please first look carefully at the source and try to understand what is going on. contact us if things are still unclear. we are very interested in hearing about your experience with them. bugfixes and suggestions for improvement are also appreciated.
Build :
the included scripts use perl, the DBI module of perl and mysql (other databases supported by DBI should also work). install them first. you should be at least a little familiar with sql.
Output :
See "usage" for details about the output of each tool.
Parameters :
See "usage" for details about the parameters needed for each tool.
Usage :
A. inserting traces into the database
-------------------------------------

([NOTE] only the default behavior of the scripts is explained below. their behavior can be modified using command line arguments. look at the scripts for the various options. using the -h flag or incorrect usage will print out a usage message for most scripts.)

1. create the database where you want to store the tables. you can do this with mysqladmin as:

     % mysqladmin create [dbname]

   where [dbname] is the name of the database.

2. set up your environment to use the database:

     export WDB_USER=[username];   # the username to use for accessing the DB
     export WDB_PASSWD=[password]; # the password
     export WDB_HOST=[hostname];   # the host where mysql is installed
     export WDB_DB=[dbname]        # the database name

   to avoid setting up this environment each time, insert these into your shell's .rc file (e.g., .bashrc), or modify localutils.pl which contains the defaults to use when the above environment variables are not defined.

3. run createIndexTablesTable.pl to create some helper index tables in the database.

4. populate the database with the monitor traces.
the format of these tables is:

mysql> show columns from raw_east_i0_c3;
+------------------+----------------------+------+-----+---------+-------+
| Field            | Type                 | Null | Key | Default | Extra |
+------------------+----------------------+------+-----+---------+-------+
| id               | int(11)              | NO   | PRI |         |       |
| aFrameTime       | decimal(18,7)        | YES  |     | NULL    |       |
| aTimeDelta       | int(10) unsigned     | YES  |     | NULL    |       |
| bChannel         | tinyint(3) unsigned  | YES  |     | NULL    |       |
| bHostTime        | int(10) unsigned     | YES  |     | NULL    |       |
| bMACTime         | int(10) unsigned     | YES  |     | NULL    |       |
| bRate            | float                | YES  |     | NULL    |       |
| bRssi            | smallint(6)          | YES  |     | NULL    |       |
| bSize            | smallint(5) unsigned | YES  |     | NULL    |       |
| cA1_Dest         | smallint(5) unsigned | YES  |     | NULL    |       |
| cA2_Src          | smallint(5) unsigned | YES  |     | NULL    |       |
| cA3              | smallint(5) unsigned | YES  |     | NULL    |       |
| cA4              | smallint(5) unsigned | YES  |     | NULL    |       |
| cDSFrom          | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cDSTo            | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cDur             | smallint(5) unsigned | YES  |     | NULL    |       |
| cFragno          | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cMoredata        | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cMorefrag        | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cOrdered         | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cProtver         | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cPwrmgt          | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cRetry           | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cSeqno           | smallint(5) unsigned | YES  |     | NULL    |       |
| cSubtype         | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cType            | tinyint(3) unsigned  | YES  |     | NULL    |       |
| cWEP             | tinyint(3) unsigned  | YES  |     | NULL    |       |
| dBeaconTimestamp | bigint(20) unsigned  | YES  |     | NULL    |       |
| zErr             | tinyint(3) unsigned  | YES  |     | NULL    |       |
| zErrDriver       | tinyint(3) unsigned  | YES  |     | NULL    |       |
| zErrPhy          | tinyint(3) unsigned  | YES  |     | NULL    |       |
+------------------+----------------------+------+-----+---------+-------+
31 rows in set (0.03 sec)

the meanings of most fields should be apparent (ignore the first letter of the fieldnames) except for maybe the following:

- zErr* fields correspond to frames that were not correctly decoded
- aFrameTime is the absolute time of the frame in seconds. this is persistent but may not be very precise.
- aTimeDelta and bHostTime: ignore these fields
- bMACTime is the hardware timestamp of the frame in microseconds. it is 32 bits wide and hence rolls over in roughly 1.2 hours.

not all fields are meaningful for all captured frames.

to save disk space (and to speed processing), the MAC address fields are stored as indices into a corresponding MACIndex table. the name of the mac index table should be [raw_table_name]_MACIndex. its format is:

mysql> show columns from raw_east_i0_c3_MACIndex;
+-------+------------------+------+-----+---------+----------------+
| Field | Type             | Null | Key | Default | Extra          |
+-------+------------------+------+-----+---------+----------------+
| id    | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| value | char(18)         | YES  |     | NULL    |                |
+-------+------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

we have included a script called createRawDataDatabase.pl that creates such tables from tcpdump logs with prism headers, such as those we collected at sigcomm04. this script is run as:

     % createRawDataDatabase.pl -d [dir] [logger] [interface] [channel]

where:

     [dir] is the directory containing the traces

the script assumes that the trace filenames are of the format [dir]/[logger]/*-ath[interface].tcpdump*. if your filenames are in a different format, modify the following line in the script:

     my @fileList = split(' ', `ls $rootDir/$logger/*-ath$interface.tcpdump*`);

the script uses tethereal to read the traces; you need to point it at the location of tethereal by modifying the following line in the script:

     my $TETHEREAL = "$ENV{HOME}/usr/local/bin/tethereal";

to insert traces of a different format into the database, you need to change the processPackets subroutine in createRawDataDatabase.pl.
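The bMACTime rollover mentioned above can be checked and worked around with a few lines of arithmetic. The following is a minimal sketch, not part of the Wit scripts: the function names are hypothetical, and it assumes frames are listed in capture order with consecutive frames less than one rollover period apart.

```python
ROLLOVER = 2 ** 32  # bMACTime is a 32-bit counter of microseconds


def rollover_hours():
    """Hours until a 32-bit microsecond counter wraps around."""
    return ROLLOVER / 1_000_000 / 3600


def unwrap_mac_times(mac_times):
    """Turn wrapping 32-bit timestamps into a non-decreasing sequence.

    Assumption of this sketch: frames are in capture order and any
    decrease between consecutive values is caused by a counter wrap.
    """
    unwrapped = []
    offset = 0
    previous = None
    for t in mac_times:
        if previous is not None and t < previous:
            offset += ROLLOVER  # the counter wrapped since the last frame
        unwrapped.append(t + offset)
        previous = t
    return unwrapped
```

A decrease caused by out-of-order frames rather than a wrap would be misread by this sketch, which is one reason to cross-check against aFrameTime.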
Usage :
B. merging (halfWit)
--------------------

([NOTE] only the default behavior of the scripts is explained below. their behavior can be modified using command line arguments. look at the scripts for the various options. using the -h flag or incorrect usage will print out a usage message for most scripts.)

merging happens in multiple, waterfall rounds. two tables are merged in each round. the first table can be a result of merging in the previous rounds; the second table is a raw table. a round consists of two steps:

a. createBeaconIndex.pl creates a temporary table that contains all the beacons that are common to the two tables. be sure to ignore beacons from APs that do not have monotonically increasing beacon timestamps.

b. createMergedTable.pl uses the beacons table to merge the two tables.

an example of the usage of these two scripts is shown in merge-commands.sh. that shell script shows how to merge the following five tables:

     raw_chi_i0_c1
     raw_sah_i0_c1
     raw_son_i0_c1
     raw_kal_i0_c1
     raw_moj_i0_c1

createBeaconIndex.pl takes as input the names of the two tables being merged and the desired name for the beacons table, which should be of the form beacons_*.

createMergedTable.pl takes as input the name of the beacons table. from beacons_*, it produces the merged table called merged_*.

for us, merging the traces of kal(ahari) was a little bit more problematic because that monitor was being shut down at night. merge-commands.sh also shows how such monitors can be merged.

the merged tables contain more columns than the raw tables. the number of additional columns depends on how many raw tables have been merged so far. a merge of three raw tables will have the following additional columns:

| zzFromTable      | tinyint(3) unsigned  | YES  |     | NULL    |                |
| zzid_0           | int(11)              | YES  |     | NULL    |                |
| zzbRssi_0        | smallint(6)          | YES  |     | NULL    |                |
| zzTDiff_0        | tinyint(3) unsigned  | YES  |     | NULL    |                |
| zzid_1           | int(11)              | YES  |     | NULL    |                |
| zzbRssi_1        | smallint(6)          | YES  |     | NULL    |                |
| zzTDiff_1        | tinyint(3) unsigned  | YES  |     | NULL    |                |
| zzid_2           | int(11)              | YES  |     | NULL    |                |
| zzbRssi_2        | smallint(6)          | YES  |     | NULL    |                |
| zzTDiff_2        | tinyint(3) unsigned  | YES  |     | NULL    |                |
+------------------+----------------------+------+-----+---------+----------------+

- zzFromTable: the input table from which this frame came (any one if it was present in multiple tables).
- zzid_[k]: the corresponding id of this frame in the k-th raw table. it is null if this frame was not present in the k-th raw table.
- zzbRssi_[k]: similar to the above.
- zzTDiff_[k]: the time difference in translated terms between the timestamp in the k-th raw table and the timestamp assigned to the merged frame. (this can be used to gauge the quality of time synchronization.)
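The per-monitor columns above follow a regular naming pattern, which makes it easy to ask which monitors captured a given merged frame. The sketch below is illustrative only: the function names are hypothetical, a merged row is assumed to be available as a dict keyed by column name, and the pattern is assumed to continue unchanged for merges of more than three tables.

```python
def merged_extra_columns(num_raw_tables):
    """Names of the additional columns after merging num_raw_tables raw tables."""
    columns = ["zzFromTable"]
    for k in range(num_raw_tables):
        columns += [f"zzid_{k}", f"zzbRssi_{k}", f"zzTDiff_{k}"]
    return columns


def monitors_that_saw(row, num_raw_tables):
    """Indices k of the raw tables in which this merged frame was present.

    A frame is present in the k-th raw table when zzid_k is not null
    (None in this sketch).
    """
    return [k for k in range(num_raw_tables) if row.get(f"zzid_{k}") is not None]
```

For example, a row with zzid_0 and zzid_2 set but zzid_1 null was captured by the first and third monitors only.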
Usage :
C. inference (nitWit)
---------------------

([NOTE] only the default behavior of the scripts is explained below. their behavior can be modified using command line arguments. look at the scripts for the various options. using the -h flag or incorrect usage will print out a usage message for most scripts.)

lostPackets.pl implements the functionality of nitWit. it can be run simply as:

     % ./lostPackets.pl [tablename]

where [tablename] can either be a raw or a merged table.

this script mainly produces three tables -- processed, extras, and synthpkts.

apart from some columns that it shares with the raw and merged tables, the processed table contains the following columns:

| _zBSSID    | smallint(5) unsigned | YES  |     | NULL    |       |
| _zDir      | tinyint(3) unsigned  | YES  |     | NULL    |       |
| _zWasrcvd  | tinyint(3) unsigned  | YES  |     | NULL    |       |
+------------+----------------------+------+-----+---------+-------+

- _zBSSID is the inferred BSSID of the packet. not all 802.11 packets contain the BSSID field in their headers. inserting this here simplifies per packet processing. similarly, the cA2_Src is filled out properly with the inferred source of the packet if it is absent (e.g., for ACKs).

- _zDir represents the direction of the packet. the meanings of the values are contained in the dirIndex table created by createIndexTablesTable.pl.

  mysql> select * from dirIndex;
  +----+--------+
  | id | value  |
  +----+--------+
  |  0 | up     |
  |  1 | down   |
  |  2 | ad hoc |
  |  3 | ap-ap  |
  +----+--------+
  4 rows in set (0.02 sec)

- _zWasrcvd represents whether the frame was received by its destination.

the extras table contains some auxiliary information for the frames. it is separate from the processed table to reduce the processing time of the scripts that operate on the processed table, which is restricted to commonly used fields.

the synthpkts table contains the inferred packets.

for an input table called raw_[details] or merged_[details], by default, the three tables will be called processed_[details], processed_[details]_extras, processed_[details]_SynthPkts.

lostPackets.pl takes a bunch of options. (for details, see the script or use the -h flag.) a couple of important ones are:

     -e regexpFile : the file containing the regular expression
     -w wtfile     : the file containing symbol weights

the script also uses the perl Parse::RecDescent package. it is included for convenience.
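The default naming rule for the three output tables can be written down as a small helper, which is handy when scripting queries against them. This is a sketch of the rule as stated above, not code from lostPackets.pl; the function name and the returned dict keys are hypothetical.

```python
def nitwit_output_tables(input_table):
    """Default output table names for a raw_[details] or merged_[details] input."""
    for prefix in ("raw_", "merged_"):
        if input_table.startswith(prefix):
            details = input_table[len(prefix):]
            break
    else:
        # Only the two documented input prefixes are handled in this sketch.
        raise ValueError(f"expected a raw_ or merged_ table, got {input_table!r}")
    processed = f"processed_{details}"
    return {
        "processed": processed,
        "extras": f"{processed}_extras",
        "synthpkts": f"{processed}_SynthPkts",
    }
```

Calling it with merged_mon1_mon2_mon3_i0_c0 yields the processed_mon1_mon2_mon3_i0_c0, ..._extras and ..._SynthPkts names that appear in the toy database.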
Usage :
D. contenders (dimWit)
----------------------

([NOTE] only the default behavior of the scripts is explained below. their behavior can be modified using command line arguments. look at the scripts for the various options. using the -h flag or incorrect usage will print out a usage message for most scripts.)

computeContention2.pl estimates the number of contenders at the instant when each packet was sent. run it simply as:

     % computeContention2.pl [tablename]

it takes a processed table (see above) as input.

to simplify the computation of aggregate statistics as a function of #contenders, it summarizes its results in two main tables. given processed_[details], the two tables will be called contention_[details]_Time and contention_[details]_Packet.

- the _Time table summarizes how a client spends its time -- whether it was idle, deferring, busy, etc. -- at each contention level.

- the _Packet table summarizes the details of clients' packets -- their rate, direction, reception status, etc. -- at each contention level.

look at the columns of the two tables (and the source) to get a better sense of what's in them.
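As an illustration of the kind of aggregate statistic these tables are meant to support, the sketch below computes the fraction of packets received at each contention level. It does not read the contention tables: the input shape, a list of (contenders, was_received) pairs, is a hypothetical stand-in, since the actual column names must be looked up in the tables themselves.

```python
from collections import defaultdict


def reception_rate_by_contention(packets):
    """Fraction of packets received at each contention level.

    packets: iterable of (contenders, was_received) pairs, a hypothetical
    shape chosen for this sketch.
    """
    sent = defaultdict(int)
    received = defaultdict(int)
    for contenders, was_received in packets:
        sent[contenders] += 1
        if was_received:
            received[contenders] += 1
    # Report levels in increasing order of contention.
    return {level: received[level] / sent[level] for level in sorted(sent)}
```

With real data, the pairs would come from a query over the _Packet table once its reception-status and contention columns have been identified.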
Example :
we have provided a toy database as an example to help you get started. the toy db contains 3 raw tables from our simulator experiments. the script wit_toydb.sh contains commands to run halfwit, nitwit, and dimwit over them.

1. assuming that you have already installed mysql, set the environment to use it:

     % export WDB_HOST=[host]
     % export WDB_USER=[username]
     % export WDB_PASSWD=[password]

   (the script below will automatically set the WDB_DB variable)

2. simply run the provided shell script:

     % ./wit_toydb.sh

   this will generate a bunch of output and warnings. if everything is working fine, your output should look like that in wit_toydb.log. (you could pipe the output of wit_toydb.sh to a file and compare it with the provided wit_toydb.log.)

3. by now, you should have a few logs in your working directory and more tables in your database. to look at the tables:

     % mysql -h [host] -u [username] -p[password] wit_toydb

   this should bring up the mysql shell, at which point, do:

     mysql> show tables;

   the output of this command should display:

+------------------------------------------+
| Tables_in_wit_toydb                      |
+------------------------------------------+
| _zParseSymbolIndex                       |
| beacons_mon1_mon2_i0_c0                  |
| beacons_mon1_mon2_i0_c0_TblIndex         |
| beacons_mon1_mon2_mon3_i0_c0             |
| beacons_mon1_mon2_mon3_i0_c0_TblIndex    |
| contention_mon1_mon2_mon3_i0_c0_Client   |
| contention_mon1_mon2_mon3_i0_c0_Interval |
| contention_mon1_mon2_mon3_i0_c0_Packet   |
| contention_mon1_mon2_mon3_i0_c0_Time     |
| dirIndex                                 |
| indexIndex                               |
| merged_mon1_mon2_i0_c0                   |
| merged_mon1_mon2_i0_c0_MACIndex          |
| merged_mon1_mon2_i0_c0_MTblIndex         |
| merged_mon1_mon2_mon3_i0_c0              |
| merged_mon1_mon2_mon3_i0_c0_MACIndex     |
| merged_mon1_mon2_mon3_i0_c0_MTblIndex    |
| processed_mon1_mon2_mon3_i0_c0           |
| processed_mon1_mon2_mon3_i0_c0_MACIndex  |
| processed_mon1_mon2_mon3_i0_c0_PTblIndex |
| processed_mon1_mon2_mon3_i0_c0_SymWts    |
| processed_mon1_mon2_mon3_i0_c0_SynthPkts |
| processed_mon1_mon2_mon3_i0_c0_extras    |
| raw_mon1_i0_c0                           |
| raw_mon1_i0_c0_MACIndex                  |
| raw_mon2_i0_c0                           |
| raw_mon2_i0_c0_MACIndex                  |
| raw_mon3_i0_c0                           |
| raw_mon3_i0_c0_MACIndex                  |
| subtypeIndex                             |
| typeIndex                                |
+------------------------------------------+
31 rows in set (0.00 sec)
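If you want to inspect the toy database from your own code, the WDB_* variables from step 1 can be reused so that credentials stay in one place. The sketch below only collects the settings and does not open a connection; the function name and the fallback values for user, password and host are hypothetical (the scripts' real defaults live in localutils.pl), while wit_toydb is the database name used in step 3.

```python
import os

# Hypothetical fallbacks for this sketch; the scripts' own defaults are
# defined in localutils.pl and may differ.
DEFAULTS = {
    "WDB_USER": "root",
    "WDB_PASSWD": "",
    "WDB_HOST": "localhost",
    "WDB_DB": "wit_toydb",
}


def wit_db_settings(environ=None):
    """Collect the WDB_* settings, falling back to DEFAULTS when unset."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}
```

Passing a plain dict instead of relying on os.environ makes the lookup easy to exercise without touching the real environment.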
Algorithm :
Wit uses three processing steps to construct an enhanced trace of system activity. First, a robust merging procedure combines the necessarily incomplete views from multiple, independent monitors into a single, more complete trace of wireless activity. Next, a novel inference engine based on formal language methods reconstructs packets that were not captured by any monitor and determines whether each packet was received by its destination. Finally, Wit derives network performance measures from this enhanced trace; we show how to estimate the number of stations competing for the medium.
The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort.
About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing.
CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022.
Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques.
Please acknowledge the source of the Data in any publications or presentations reporting use of this Data.
Citation: Ratul Mahajan, Maya Rodrig, John Zahorjan, CRAWDAD toolset tools/analyze/802.11/Wit (v. 2006‑09‑29),
https://doi.org/10.15783/C7DG6G, Sep 2006.
Dataset Files
- wit-v20060929.tar.gz (3.91 MB)
- wit-readme-v20060929.txt (17.11 kB)
Documentation
Attachment | Size |
---|---|
README-v20060929.txt | 16.71 KB |