TRANSIT_dataset

- Submitted by: Huize Sun
- DOI: 10.21227/5rp9-9696
Abstract
The TRANSIT dataset is the default dataset of the simulation software Traffic Anomaly Simulation Tool (TRANSIT). The TRANSIT code is available in the bigzeze/TRANSIT repository.
Instructions:
Introduction
The TRANSIT dataset is the default dataset of the simulation software Traffic Anomaly Simulation Tool (TRANSIT). The code of TRANSIT is available at:
bigzeze/TRANSIT: TRaffic ANomaly SImulation Tool
File Structure
Each folder under the root directory represents a type of simulated road network scenario. The second-level directories denote anomaly injection types and store the datasets. The data files under each anomaly scenario are similar and include:
- detectors.csv – Raw data file for fixed detector data
- trajectory.csv – Raw data file for floating car data
- detectors.npy – Preprocessed file for fixed detector data
- nodes.npy – Names of fixed detectors, in the same order as detectors.npy
- events.txt – Event sequence data
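The layout described above can be explored with a short script. This is a minimal sketch, assuming only that every dataset folder contains a detectors.npy file; the function name find_datasets is hypothetical, and the recursive search is used so that the number of directory levels does not have to be hard-coded.

```python
from pathlib import Path


def find_datasets(root):
    """Return the sorted folders (relative to `root`) that contain a detectors.npy file."""
    root = Path(root)
    # rglob searches all nested levels, so the scenario/anomaly depth is not assumed.
    return sorted(p.parent.relative_to(root) for p in root.rglob("detectors.npy"))


# Example call (the root path is a placeholder for wherever the dataset was extracted):
# for folder in find_datasets("path/to/dataset_root"):
#     print(folder)
```

Each returned path is expected to read as scenario folder followed by anomaly type, if the extracted archive matches the structure described here.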
Meta Data
For CSV files, refer to the headers.
detectors.npy
It is an array with shape [number_of_nodes, number_of_metrics, length_of_time]. The raw detector data is aggregated at the road segment level (nodes), so the length of the first dimension equals the number of monitored road segments in the network. The second dimension holds the metrics, stored in this order: flow rate (veh/min), occupancy (%), and speed (m/s). The third dimension is time; the interval between two consecutive data points is the detection interval length defined in the simulation.
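The axis layout can be made explicit in code. The following is a minimal sketch, assuming the metric order stated above; the helper split_metrics and the key names are hypothetical, and the demo array is synthetic, standing in for the result of np.load("detectors.npy").

```python
import numpy as np

# Metric order as stated in the description: flow rate, occupancy, speed.
METRICS = ("flow", "occupancy", "speed")


def split_metrics(detectors):
    """Split a [nodes, metrics, time] array into one [nodes, time] array per metric."""
    detectors = np.asarray(detectors)
    if detectors.ndim != 3 or detectors.shape[1] != len(METRICS):
        raise ValueError(
            f"expected shape [nodes, {len(METRICS)}, time], got {detectors.shape}"
        )
    return {name: detectors[:, i, :] for i, name in enumerate(METRICS)}


# Synthetic stand-in for the real file: 4 nodes, 3 metrics, 10 time steps.
demo = np.arange(4 * 3 * 10, dtype=float).reshape(4, 3, 10)
by_metric = split_metrics(demo)
print(by_metric["speed"].shape)  # (4, 10)
```

The shape check is there to catch a file whose second axis does not hold exactly these three metrics.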
nodes.npy
It is a one-dimensional array storing the names of the nodes, in the same order as detectors.npy.
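Because the two files share the same row order, node names can be used to index the detector array. This is a minimal sketch with a hypothetical helper series_by_node; the node names and values below are invented for illustration. When loading the real nodes.npy, np.load may need allow_pickle=True if the names were saved as Python objects, which is an assumption not confirmed by the description.

```python
import numpy as np


def series_by_node(nodes, detectors):
    """Map each node name to its [metrics, time] slice, relying on the shared row order."""
    nodes = np.asarray(nodes)
    detectors = np.asarray(detectors)
    if nodes.ndim != 1 or len(nodes) != detectors.shape[0]:
        raise ValueError("nodes must be 1-D and match the first axis of detectors")
    return {str(name): detectors[i] for i, name in enumerate(nodes)}


# Synthetic stand-ins for nodes.npy and detectors.npy (names are invented).
demo_nodes = np.array(["edge_a", "edge_b"])
demo_detectors = np.zeros((2, 3, 5))
demo_detectors[1, 2, :] = 13.9  # speed (m/s) of the second node
lookup = series_by_node(demo_nodes, demo_detectors)
print(lookup["edge_b"][2])  # speed series of edge_b
```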
events.txt
This file stores simulation log records and congestion events. Each record starts with the simulation timestep, followed by a textual description.
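A record of this form can be split into its two parts as follows. This is only a sketch: the exact delimiter is not specified here, so the code assumes one record per line with the timestep as the first whitespace-separated token. The function name parse_event_line and the sample record are hypothetical, so check a few lines of the real file before relying on it.

```python
def parse_event_line(line):
    """Split one record into (timestep, description); return None if it cannot be parsed."""
    parts = line.strip().split(maxsplit=1)
    if not parts:
        return None  # blank line
    try:
        # Tolerate a trailing ':' or ',' after the timestep (an assumption).
        timestep = float(parts[0].rstrip(":,"))
    except ValueError:
        return None  # first token is not a number
    description = parts[1] if len(parts) > 1 else ""
    return timestep, description


# Invented example record, not taken from the dataset.
print(parse_event_line("120 congestion detected on edge_a"))
```

Lines that do not start with a number are returned as None rather than raising, so unexpected log lines can be inspected separately.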
Application Prospects
This dataset is currently being used in an ongoing research project on deep learning-based causal discovery for traffic anomalies. It could also be used to train models for anomaly detection, traffic prediction, multimodal time series alignment in transportation, and related tasks.