Open Access
Vehicle speed dataset
- Submitted by: Jiri Vrany
- Last updated: Wed, 11/15/2023 - 07:28
- DOI: 10.21227/n1z9-e630
Abstract
We obtained this dataset as part of a project to generate a realistic speed profile on a trip specified by GPS coordinates. Specifically, we focused on generating the speed profile for a passenger car traveling on an unfamiliar route, i.e., a route the machine-learning model has yet to see.
The dataset contains 5973 rides made by five different passenger cars, with a total length of 9049.3 km. The data were collected during 2021 in the Czech Republic and include municipal and non-municipal trips.
The maximum allowed speed in the Czech Republic is 50 km/h in a municipality, 90 km/h outside a municipality and 130 km/h on a motorway. In addition, there may be sections with different speed limits.
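As a rough illustration, the statutory limits above could serve as fallback values when a road segment carries no explicit limit. The category labels and the `explicit_limit_kmh` argument in this sketch are assumptions made for the example, not fields of the dataset.

```python
from typing import Optional

# Statutory default limits in the Czech Republic (km/h), as listed above.
# The dictionary keys are hypothetical labels chosen for this sketch.
DEFAULT_LIMITS_KMH = {
    "municipality": 50,
    "outside_municipality": 90,
    "motorway": 130,
}


def effective_limit_kmh(road_context: str,
                        explicit_limit_kmh: Optional[float] = None) -> float:
    """Return the explicit limit if one is known, else the statutory default."""
    if explicit_limit_kmh is not None:
        return float(explicit_limit_kmh)
    return float(DEFAULT_LIMITS_KMH[road_context])
```

A section with a different posted limit would pass it as `explicit_limit_kmh`, which takes precedence over the default.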
We use the map-matching algorithms of the Open Source Routing Machine (OSRM) to pair the measured spatiotemporal data with OpenStreetMap (OSM) geographical data. For this purpose, we use the Overpass API. The imputed information includes the road type, maximum speed limit, intersection exits, traffic signals, and pedestrian crossings. Further, we add elevation and slope data from the Open-Elevation API, an open-source elevation API based on Shuttle Radar Topography Mission (SRTM) data.
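To make the map-matching step more concrete, the sketch below builds a request URL for the OSRM HTTP match service. It is not the pipeline used to create the dataset: the server address, the `driving` profile, the chosen query options, and the sample coordinates are all assumptions made for illustration. The function only assembles the URL and sends no request.

```python
from typing import Sequence, Tuple


def build_osrm_match_url(
    points: Sequence[Tuple[float, float]],
    timestamps: Sequence[int],
    base_url: str = "http://localhost:5000",  # assumed local OSRM server
    profile: str = "driving",                 # assumed routing profile
) -> str:
    """Build a URL for the OSRM match service.

    points are (latitude, longitude) pairs; OSRM expects lon,lat order.
    timestamps are UNIX seconds, one per point.
    """
    if len(points) != len(timestamps):
        raise ValueError("need exactly one timestamp per point")
    coords = ";".join(f"{lon:.6f},{lat:.6f}" for lat, lon in points)
    times = ";".join(str(int(t)) for t in timestamps)
    return (
        f"{base_url}/match/v1/{profile}/{coords}"
        f"?timestamps={times}&geometries=geojson&overview=full&annotations=true"
    )


# Two made-up GPS fixes five seconds apart (illustrative values only).
url = build_osrm_match_url(
    [(50.7663, 15.0543), (50.7670, 15.0550)],
    [1609459200, 1609459205],
)
```

The response of a real OSRM server would then contain the matched geometry, which can be joined with OSM attributes in a separate step.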
Citations:
Vrany, J., Krepelka, M., Chumlen, M. (2023). Generating Synthetic Vehicle Speed Records Using LSTM. In: Maglogiannis, I., Iliadis, L., MacIntyre, J., Dominguez, M. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 675. Springer, Cham. https://doi.org/10.1007/978-3-031-34111-3_12
Acknowledgement:
This dataset was supported by the Technology Agency of the Czech Republic project CK01000020, “Development of a GNSS route generator and CANBUS signal with machine learning using Software Defined Radio”, and project CK02000136, “Virtual Convoy—a comprehensive environment for testing CAR2X communication systems”.
The directories in the archive follow the pattern CNR_2021_MM, where CNR (car number) is the code of the test vehicle and MM is the month. The manufacturer, model, engine type and other information for each car are in the attached vehicles.csv file.
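A directory name following this pattern can be split into its parts with a small helper. This is a minimal sketch: it assumes the month is written with one or two digits and treats the car code as an opaque string, since the exact form of the codes is not stated here. The example name `1_2021_07` is hypothetical.

```python
import re
from typing import NamedTuple


class TripDir(NamedTuple):
    car: str
    year: int
    month: int


# CNR_2021_MM: car code, the fixed year 2021, and a month number.
_DIR_PATTERN = re.compile(r"^(?P<car>.+)_(?P<year>2021)_(?P<month>\d{1,2})$")


def parse_trip_dir(name: str) -> TripDir:
    """Split a directory name such as '1_2021_07' into car, year and month."""
    match = _DIR_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected directory name: {name!r}")
    month = int(match["month"])
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {name!r}")
    return TripDir(match["car"], int(match["year"]), month)
```

The returned car code can then be looked up in vehicles.csv to find the vehicle details.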
Each .csv file represents a single trip. It contains GPS data (Lat, Lon) and the actual measured speed at a given point on the route. The data are indexed by distance and sampled every 1 m.
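Because the samples are spaced by distance rather than by time, the travel time of a trip has to be reconstructed by summing step length over speed. The sketch below shows one way to do this. It assumes speeds are in km/h and that the column is named `speed`; both should be checked against descriptor.json. Time spent standing still cannot be recovered from distance-indexed data, so the hypothetical `min_speed_kmh` parameter clamps very low speeds to avoid division by zero.

```python
import csv
import io
from typing import Iterable


def travel_time_s(speeds_kmh: Iterable[float],
                  step_m: float = 1.0,
                  min_speed_kmh: float = 1.0) -> float:
    """Approximate travel time as the sum of step_m / v over all samples."""
    total = 0.0
    for v in speeds_kmh:
        v_ms = max(v, min_speed_kmh) / 3.6  # km/h -> m/s, clamped from below
        total += step_m / v_ms
    return total


# A tiny in-memory stand-in for one trip file; header names are assumed.
sample = io.StringIO(
    "speed,latitude,longitude\n"
    "36.0,50.0,15.0\n"
    "36.0,50.0,15.0\n"
    "72.0,50.0,15.0\n"
)
speeds = [float(row["speed"]) for row in csv.DictReader(sample)]
elapsed = travel_time_s(speeds)  # seconds needed to cover the three 1 m steps
```

With real files, the `io.StringIO` stand-in would be replaced by an opened trip file from the archive.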
Each trip contains:
- Vehicle data: speed, latitude, longitude, elevation
- Route geometry data: speed_osrm, categorical variables from OSM
See file descriptor.json for further details.
To reduce information leakage during training, we did not use the attributes with absolute values (latitude, longitude, azimuth, and elevation). However, we kept these attributes in the dataset, where they can be used to verify the data or to add further information.
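The exclusion of absolute-valued attributes before training can be sketched as a simple column filter. The column names below are taken from the text above, but their exact spelling in the CSV files is an assumption, and the helper name is hypothetical.

```python
from typing import Dict, Iterable, List, Sequence

# Attributes described above as absolute-valued; exact CSV column names
# are an assumption and should be checked against descriptor.json.
ABSOLUTE_COLUMNS = ("latitude", "longitude", "azimuth", "elevation")


def drop_absolute_columns(
    rows: Iterable[Dict[str, str]],
    columns: Sequence[str] = ABSOLUTE_COLUMNS,
) -> List[Dict[str, str]]:
    """Return copies of the rows without the absolute-valued columns."""
    excluded = set(columns)
    return [{k: v for k, v in row.items() if k not in excluded} for row in rows]


rows = [{"speed": "48.2", "latitude": "50.7", "longitude": "15.0",
         "speed_osrm": "50"}]
features = drop_absolute_columns(rows)
```

The original rows are left untouched, so the coordinates remain available for verification or enrichment.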
Dataset Files
- vehicle_speed.zip (340.44 MB)