This dataset contains mobility traces of taxi cabs in San Francisco, USA. It contains GPS coordinates of approximately 500 taxis collected over 30 days in the San Francisco Bay Area.
date/time of measurement start: 2008-05-17
date/time of measurement end: 2008-06-10
collection environment: This data set contains mobility traces of taxi cabs in San Francisco, USA. It contains GPS coordinates of approximately 500 taxis collected over 30 days in the San Francisco Bay Area.
Cab mobility traces are provided by the Exploratorium - the museum of science, art and human perception through the cabspotting project: http://cabspotting.org .
Cabspotting is designed as a living framework to use the activity of commercial cabs as a starting point to explore the economic, social, political and cultural issues that are revealed by the cab traces. Where do cabs go the most? Where do they never turn up? Cab Projects are vehicles for artists, writers, or researchers to explore these issues in the form of a small experiment, investigation or observation.
network configuration: Cab mobility traces are provided by the Exploratorium - the museum of science, art and human perception through the cabspotting project: http://cabspotting.org .
"Each San Francisco based Yellow Cab vehicle is currently outfitted with a GPS tracking device that is used by dispatchers to efficiently reach customers. The data is transmitted from each cab to a central receiving station, and then delivered in real-time to dispatch computers via a central server. This system broadcasts the cab call number, location and whether the cab currently has a fare."(*)
You can use this data set of cab mobility traces that were collected in May 2008. (*) http://cabspotting.org/about.html
data collection methodology: Each taxi is equipped with a GPS receiver and sends a location-update (timestamp, identifier, geo-coordinates) to a central server. The location-updates are quite fine-grained - the average time interval between two consecutive location updates is less than 10 sec, allowing us to accurately interpolate node positions between location-updates.
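The description above does not say which interpolation method is used. As a minimal sketch, assuming plain linear interpolation in latitude/longitude between two consecutive location-updates (a reasonable approximation only when updates are close together in time and space), a node position at an intermediate time could be estimated as follows. The function name `interpolate_position` and the `(time, latitude, longitude)` tuple layout are hypothetical, chosen for illustration.

```python
from typing import Tuple

# (unix_time_seconds, latitude_degrees, longitude_degrees); layout is an assumption
Update = Tuple[float, float, float]


def interpolate_position(t: float, before: Update, after: Update) -> Tuple[float, float]:
    """Linearly interpolate (latitude, longitude) at time t between two updates.

    Assumes before[0] <= t <= after[0]; raises ValueError otherwise.
    """
    t0, lat0, lon0 = before
    t1, lat1, lon1 = after
    if not t0 <= t <= t1:
        raise ValueError("t must lie between the two location-updates")
    if t1 == t0:
        # Both updates carry the same timestamp: nothing to interpolate.
        return lat0, lon0
    w = (t - t0) / (t1 - t0)  # fraction of the interval elapsed at time t
    return lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0)


# Made-up example values, not taken from the dataset.
midpoint = interpolate_position(5.0, (0.0, 37.0, -122.0), (10.0, 38.0, -123.0))
```

For long gaps between updates, a straight line in coordinate space may cut across blocks or water, so a map-matching step would be needed for anything beyond a rough estimate.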
sanitization: Out of respect for the privacy of cab drivers and customers, no direct access to cab data with identifiable cab numbers is provided to the public. The project emphasizes the analysis of aggregate data and data patterns, with most of this analysis happening on historic data and larger data patterns using scrambled cab numbers.
This traceset contains mobility traces of taxi cabs in San Francisco, USA. It contains GPS coordinates of approximately 500 taxis collected over 30 days in the San Francisco Bay Area.
- file: cabspottingdata.tar.gz
- measurement purpose: User Mobility Characterization, Location-aware Computing, Human Behavior Modeling
- methodology: The cab locations are not stored by Yellow Cab, but only used in real-time to aid dispatch. Our system talks to the Yellow Cab server and stores the data in a database, encoding the call number for privacy. Server-side processes compute the aggregate map at various time intervals (10 minutes, 1 hour, 8 hours, etc.) and store these frames as PostScript and bitmap images. These are subsequently combined into movies for every day, week, etc. Images and movies can be queried by visitors to the site in the Time Lapse area. A sample of real-time data overlaid on the most recent map can be seen in the Cab Tracker client. You can collect your own cab mobility traces following the instructions from http://cabspotting.org/api.
- may_2008: Mobility traces of taxi cabs in San Francisco, USA
- configuration: This archive contains file '_cabs.txt' with the list of all cabs and for each cab its mobility trace in a separate ASCII file, e.g. 'new_abboip.txt'.
- format: Each line of a mobility trace file contains four space-separated fields, [latitude longitude occupancy time], e.g.: [37.75134 -122.39488 0 1213084687]. Latitude and longitude are in decimal degrees, occupancy shows whether the cab has a fare (1 = occupied, 0 = free), and time is a UNIX epoch timestamp.
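The record format described above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the names `Fix` and `parse_line` are hypothetical, the timestamp is assumed to be in whole seconds, and it is assumed that the square brackets shown in the example may or may not appear in the files, so they are stripped if present.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Fix(NamedTuple):
    """One location-update from a cab trace file (illustrative type)."""
    latitude: float   # decimal degrees
    longitude: float  # decimal degrees
    occupied: bool    # True if the cab has a fare (occupancy == 1)
    time: datetime    # UNIX epoch time converted to an aware UTC datetime


def parse_line(line: str) -> Fix:
    """Parse one 'latitude longitude occupancy time' record."""
    # Strip surrounding whitespace and optional brackets, then split on spaces.
    lat, lon, occupancy, epoch = line.strip().strip("[]").split()
    return Fix(
        latitude=float(lat),
        longitude=float(lon),
        occupied=(occupancy == "1"),
        time=datetime.fromtimestamp(int(epoch), tz=timezone.utc),
    )


# The example record given in the format description.
fix = parse_line("37.75134 -122.39488 0 1213084687")
```

To load a whole trace, the same function can be applied to every non-empty line of a per-cab file, e.g. `[parse_line(l) for l in open(path) if l.strip()]`, where `path` points at a file such as 'new_abboip.txt' extracted from the archive.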
The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort.
About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing.
CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022.
Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques.
Please acknowledge the source of the Data in any publications or presentations reporting use of this Data.
Michal Piorkowski, Natasa Sarafijanovic-Djukic, Matthias Grossglauser, epfl/mobility, https://doi.org/10.15783/C7J010 , Date: 20090224
- cabspottingdata.tar.gz (90.68 MB)
These datasets are part of Community Resource for Archiving Wireless Data (CRAWDAD). CRAWDAD began in 2004 at Dartmouth College as a place to share wireless network data with the research community. Its purpose was to enable access to data from real networks and real mobile users at a time when collecting such data was challenging and expensive. The archive has continued to grow since its inception, and starting in summer 2022 is being housed on IEEE DataPort.