The OSHA dataset: Overtaking on Simulated HighwAys
- Submitted by: IECCResearch VW
- Last updated: Wed, 01/10/2024 - 16:51
- DOI: 10.21227/ha78-f960
Abstract
Volkswagen Group of America Innovation and Engineering Center California (VW IECC) is a research facility in Belmont, California, working on the future of mobility. Recent years have seen exciting developments in autonomous vehicles. In general, a lack of data is the main obstacle to solving the task of autonomous driving. One important task in this area is overtaking and lane changing, especially in highway scenarios. Because real-world data collection is costly and challenging, particularly in a highway environment, Innovation Center Europe (ICE) developed a simulation environment named SimPilot for collecting data in highway scenarios. To help solve the task of overtaking and automatic lane changing, we release a large-scale dataset (more than 8M frames) called Overtaking on Simulated HighwAys (OSHA). It contains Ego vehicles driving in a highway environment with traffic, and consists of lane ID sensor maps and a data frame recorded at each frame. The scene is a 3.2 km long loop with 3 lanes, and traffic density differs between episodes. The other movable objects in the scene are limited to vehicles (no buses, bikes, trucks, or pedestrians) and are controlled by SUMO.
The Ego was controlled by a rule-based algorithm developed in-house to ensure the maximum safety and correctness of every lane change, as well as compliance with the traffic rules and speed limits. The 8M frames were recorded one per step, with consecutive steps 50 ms apart (about 50 hrs of Ego driving). Traffic density is set as the number of vehicles per km, which varies from 5 to 35. In addition, the speed limit is changed randomly in each episode, and different sections of the road have different speed limits.
For more detailed documentation and instructions, please read the PDF available on IEEE DataPort in the Documentation section.
We are releasing two datasets: the raw dataset and the pre-processed dataset that was used in our paper: link to the paper. The differences between the two datasets can be seen in the column lists below:
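To make the recording parameters above concrete, here is a minimal Python sketch that converts a step count to elapsed time and a traffic density to a vehicle count on the loop. The function names are hypothetical, and the vehicle count assumes that the density is measured per km of road summed over all lanes, which the description does not state explicitly.

```python
STEP_MS = 50          # time between recorded steps, from the description
LOOP_LENGTH_KM = 3.2  # length of the highway loop, from the description


def steps_to_seconds(n_steps: int) -> float:
    """Elapsed simulation time covered by n_steps recorded steps."""
    return n_steps * STEP_MS / 1000


def vehicles_on_loop(density_per_km: float) -> int:
    """Approximate number of traffic vehicles on the loop.

    Assumption: density counts vehicles per km of road across all lanes.
    """
    return round(density_per_km * LOOP_LENGTH_KM)


if __name__ == "__main__":
    print(steps_to_seconds(1200))   # one minute of driving at 50 ms per step
    print(vehicles_on_loop(5))      # lowest stated density
    print(vehicles_on_loop(35))     # highest stated density
```

Under that assumption, the stated density range of 5 to 35 vehicles per km corresponds to roughly 16 to 112 vehicles on the 3.2 km loop.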
raw columns:
initials, time, milestone, task, eps, step, pythonTime, ego_speed, position_x, poisition_y, timestamp, heading_x, heading_y, acceleration_x, acceleration_y, orientation, continuous_lane_id, lane_relative_t, angle_to_lane, controller_state, vehicle_switching_lane, traj_pose_x, traj_pose_y, traj_pose_v, control_points, static_lanes, image_name, speed_limit, expert_type, collision_type, left_lane_available, right_lane_available, allowed_speed, movable_obj, speed_action, lane_change_command, travel_assist_lane_change_state
pre-processed columns:
index, initials, time, milestone, task, eps, step, pythonTime, ego_speed, position_x, poisition_y, timestamp, heading_x, heading_y, acceleration_x, acceleration_y, orientation, continuous_lane_id, lane_relative_t, angle_to_lane, controller_state, vehicle_switching_lane, static_lanes, image_name, speed_limit, expert_type, collision_type, left_lane_available, right_lane_available, allowed_speed, movable_obj, speed_action, lane_change_command, travel_assist_lane_change_state, future_x_local_array, future_y_local_array, future_v_global_array, future_points, future_ta_lane_change_array, car_matrix, lane_change_command_modified, sorted_movable_obj, movable_obj_EucDist
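The two column lists are long, so the difference between them is easier to see programmatically. The following Python sketch compares the lists exactly as printed above (including the `poisition_y` spelling); the helper name `column_diff` is hypothetical and is not part of the released code.

```python
RAW_COLUMNS = [
    "initials", "time", "milestone", "task", "eps", "step", "pythonTime",
    "ego_speed", "position_x", "poisition_y", "timestamp", "heading_x",
    "heading_y", "acceleration_x", "acceleration_y", "orientation",
    "continuous_lane_id", "lane_relative_t", "angle_to_lane",
    "controller_state", "vehicle_switching_lane", "traj_pose_x",
    "traj_pose_y", "traj_pose_v", "control_points", "static_lanes",
    "image_name", "speed_limit", "expert_type", "collision_type",
    "left_lane_available", "right_lane_available", "allowed_speed",
    "movable_obj", "speed_action", "lane_change_command",
    "travel_assist_lane_change_state",
]

PREPROCESSED_COLUMNS = [
    "index", "initials", "time", "milestone", "task", "eps", "step",
    "pythonTime", "ego_speed", "position_x", "poisition_y", "timestamp",
    "heading_x", "heading_y", "acceleration_x", "acceleration_y",
    "orientation", "continuous_lane_id", "lane_relative_t", "angle_to_lane",
    "controller_state", "vehicle_switching_lane", "static_lanes",
    "image_name", "speed_limit", "expert_type", "collision_type",
    "left_lane_available", "right_lane_available", "allowed_speed",
    "movable_obj", "speed_action", "lane_change_command",
    "travel_assist_lane_change_state", "future_x_local_array",
    "future_y_local_array", "future_v_global_array", "future_points",
    "future_ta_lane_change_array", "car_matrix",
    "lane_change_command_modified", "sorted_movable_obj",
    "movable_obj_EucDist",
]


def column_diff(raw, processed):
    """Return (removed, added) column names, preserving list order."""
    raw_set, processed_set = set(raw), set(processed)
    removed = [c for c in raw if c not in processed_set]
    added = [c for c in processed if c not in raw_set]
    return removed, added


removed, added = column_diff(RAW_COLUMNS, PREPROCESSED_COLUMNS)
print("Only in raw:", removed)
print("Only in pre-processed:", added)
```

Based on the lists as printed, the pre-processed dataset drops `traj_pose_x`, `traj_pose_y`, `traj_pose_v`, and `control_points`, and adds `index` plus the nine derived columns from `future_x_local_array` through `movable_obj_EucDist`.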
Dataset Files
- Validation dataset used for PyTorch Lightning validation. validation.zip (178.00 MB)
- Single processed dataframe used for training. df.zip (9.94 GB)
- Lane ID sensor images. images.zip (4.69 GB)
- Raw data collected in each episode. raw.zip (935.66 MB)
- Serialized dataset serialized.zip (22.16 GB)
- Models from the paper and evaluation results paper-models.zip (418.48 MB)
Documentation
Attachment | Size |
---|---|
IEEE dataport-v18-20231218_191835.pdf | 4.9 MB |