Classification Data for Road Accidents in Dhaka City

Citation Author(s):
MD Shahriar Tamjid
Notre Dame University Bangladesh
Narin Nur
Notre Dame University Bangladesh
Submitted by:
Sharif MD Shahr...
Last updated:
Fri, 04/05/2024 - 13:30


Road accidents are a serious threat to public safety and a major global concern in urban settings. Dhaka City, the capital of Bangladesh, illustrates the complex road-safety challenges faced by densely populated cities. In this paper, we use time-series analysis to model temporal patterns and trends in accident occurrence, and machine learning algorithms to identify accident hotspots and understand the causes of traffic accidents. We collected road accident data from the Accident Research Institute (ARI) of BUET for the period 2007–2021 and used it to train a machine learning model that forecasts the likelihood of an accident occurring at a specific place. We found that the junction, traffic conditions, weather conditions, and lighting conditions all affect the likelihood of an accident. These findings can inform targeted initiatives to reduce the number of traffic accidents in Dhaka City, as well as educational campaigns that raise awareness of the factors contributing to road accidents.
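As a rough illustration of the trend analysis mentioned above, the sketch below aggregates accident records into monthly and yearly counts. It is not the authors' pipeline: the toy records are invented, and the assumption that Year and Month are stored as integers is ours.

```python
import pandas as pd

# Toy records standing in for the real data; the values are invented.
# Assumption: Year and Month are stored as integers.
records = pd.DataFrame({
    "Year":  [2020, 2020, 2020, 2021, 2021],
    "Month": [1, 1, 2, 1, 1],
})

# One row per (Year, Month) with the number of accident records in that month.
monthly = (
    records.groupby(["Year", "Month"])
    .size()
    .rename("accidents")
    .reset_index()
)

# Yearly totals give a coarse view of the long-term trend.
yearly = monthly.groupby("Year")["accidents"].sum()

print(monthly)
print(yearly)
```

Counts like these could then be passed to a time-series model; the choice of model is left open here because the description does not name one.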


This dataset contains information about the daily count of accidents. It comprises 9 attributes and 20,762 data points. Preprocessing addressed several issues: null and redundant values were found in many columns, so the dataset, which is intended for categorical analysis, required cleaning. Columns containing null or redundant values were removed, as were low-diversity columns with only two or three distinct values, which could have hindered classification. The resulting dataset consists of 9 columns: ID, Year, Month, Location, Accident Intensity, Junction, Traffic Control, Weather, and Lighting. Finally, rows containing null values were removed to ensure data quality, leaving a final dataset of 17,497 rows.
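The cleaning steps described above can be sketched with pandas roughly as follows. This is a minimal illustration, not the authors' script: the function name `clean_accident_table`, the 50% null threshold for dropping a column, and the toy columns `Divider` and `Remarks` are all hypothetical.

```python
import pandas as pd


def clean_accident_table(raw, max_null_fraction=0.5, min_distinct=4):
    """Hypothetical helper mirroring the three cleaning steps in the description."""
    df = raw.copy()

    # Step 1: drop columns that are mostly null.
    # The 0.5 threshold is an assumption; the description gives no cutoff.
    null_fraction = df.isna().mean()
    df = df.loc[:, null_fraction <= max_null_fraction]

    # Step 2: drop low-diversity columns, i.e. those with fewer than
    # min_distinct distinct non-null values (three or fewer by default).
    distinct = df.nunique(dropna=True)
    df = df.loc[:, distinct >= min_distinct]

    # Step 3: drop any remaining rows that still contain a null value.
    return df.dropna(axis=0, how="any").reset_index(drop=True)


# Toy table with invented values, used only to exercise the function.
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Year": [2007, 2008, 2009, 2010, 2011, 2012],
    "Weather": ["Fair", "Rain", "Fog", "Wind", None, "Fair"],
    "Divider": ["Yes", "No", "Yes", "No", "Yes", "No"],  # two distinct values
    "Remarks": [None, None, None, None, "x", None],      # mostly null
})

cleaned = clean_accident_table(toy)
print(cleaned.columns.tolist())
print(len(cleaned))
```

On the toy table, `Remarks` is dropped as mostly null, `Divider` is dropped as low-diversity, and the one row with a missing `Weather` value is removed.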


