Abstract
Introduction: Road traffic injuries (RTIs) are among the most important public health problems and causes of mortality worldwide, particularly in Iran.
Methods: We used data on RTIs registered in the East Azerbaijan forensic medicine organization database from 2017-03-19 to 2021-03-20. Information on predictor variables was obtained from traffic monitoring camera data. We developed eight machine learning prediction models: logistic regression (LR), elastic net regression, decision tree (DT), random forest (RF), extreme gradient boosting (EGB), linear and non-linear support vector machines (SVM), and artificial neural networks (ANNs). We used RF to evaluate the importance of each predictor in predicting death.
Results: The mean number of class 1, 2, and 4 vehicles on the road was significantly higher on days when a death occurred than on days without a death, whereas the opposite pattern was significant for class 3 and 5 vehicles. As in the training data, RF provided the highest prediction accuracy in the testing data, with an AUC of 91% (95% CI: 88%-93%). The total number of class 2 vehicles on the roads was by far the most important predictor variable (variable importance: 83.95), followed by the number of instances of unsafe distance while driving (58.50). The number of class 4 vehicles (56.58) and the average speed of vehicles (56.31) were the next most important variables.
Conclusion: Using the RF machine learning algorithm, the occurrence of death in road accidents can be predicted with high accuracy, with the number of class 2 vehicles on the roads as the most important predictor.
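
For readers unfamiliar with the AUC reported above, the following is a minimal sketch of how such a value and an interval around it could be computed. The abstract does not state how the AUC or its 95% CI were obtained; the pairwise definition and the percentile bootstrap used here are assumptions, the function names auc and bootstrap_ci are hypothetical, and the labels and scores are toy values, not study data.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    idx = range(len(labels))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        # Skip resamples that contain only one class: AUC is undefined there.
        if 0 < sum(ys) < len(ys):
            stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int(round((alpha / 2) * n_boot))]
    upper = stats[int(round((1 - alpha / 2) * n_boot)) - 1]
    return lower, upper

# Toy example: 1 = day with a death, 0 = day without (made-up values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

The pairwise form is quadratic in the number of days and is meant only to make the definition concrete; a rank-based computation gives the same value more efficiently on larger data.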