INTRODUCTION

Traffic collisions in the US impose a major economic burden, causing billions of dollars in property damage and medical expenses each year, in addition to the loss of life. They also reduce productivity and have indirect economic effects such as decreased tourism, decreased investment, and higher insurance premiums and taxes. Mitigating these effects requires sustained investment in road safety programs, regulation, infrastructure, and technology.

Every year, tens of thousands of people are killed and many more are injured in motor vehicle accidents, resulting in significant economic and social costs. The National Highway Traffic Safety Administration (NHTSA) collects data on traffic crashes and fatalities in order to better understand their causes and to develop strategies to reduce them. Given the yearly increase in road accident rates, an investigation into US traffic accidents and fatalities is of utmost importance and should be approached using data-driven methodologies. By analyzing data from sources such as police reports, vehicle crash data, and weather reports, it is possible to identify the causes, trends, and patterns behind traffic accidents and fatalities. This information can then be used to develop and implement road safety initiatives that save lives on US roads.

Moreover, with the rise of autonomous and semi-autonomous vehicles, it is becoming increasingly important to study how human drivers and automated driving systems interact in order to ensure their safe coexistence on the roads. The analysis of US traffic accidents and fatalities therefore has significant implications for the future of transportation, as well as for public health and safety. Advanced analytical techniques make it possible to extract meaningful insights from large and complex datasets and to develop models that estimate the likelihood of accidents and fatalities. Using these insights and models, policymakers, regulators, and transportation companies can work together to develop and implement safety strategies that reduce the risk of accidents and fatalities on US roads.

Traffic accidents can occur for a variety of reasons, including driver error, road conditions, vehicle malfunctions, and external factors such as weather and road construction. Understanding these contributing factors is crucial for developing effective prevention strategies. Analyzing US traffic accident data makes it possible to identify trends and patterns in the causes and outcomes of accidents. This information can then inform policy decisions and initiatives aimed at improving road safety and reducing the impact of accidents on individuals and the wider community. The insights gained can also help shape future research and development in transportation and road safety.

Beyond the economic and societal costs, traffic accidents also have a profound impact on public health and safety. Lives are lost, families and communities are devastated, and individuals suffer from injuries that can have long-lasting physical, emotional, and financial repercussions. Vulnerable road users, such as pedestrians, cyclists, and motorcyclists, are particularly at risk. Understanding the complex factors that contribute to traffic accidents and fatalities is critical for developing evidence-based strategies to prevent them and promote road safety for all users. Furthermore, advancements in technology, such as telematics, sensor-based systems, and machine learning algorithms, have the potential to revolutionize road safety by enabling real-time monitoring, prediction, and prevention of accidents. Exploring the use of cutting-edge technologies in analyzing traffic accident data can provide valuable insights and inform the development of innovative solutions to address the ongoing challenges of road safety in the US.

Motivation

Studying traffic crashes and fatalities is important because it provides insight into the root causes of these incidents and can help identify ways to prevent them. With the increasing number of vehicles on the road, it is essential to understand the factors that contribute to crashes and fatalities so that effective measures can be taken to reduce their occurrence. This understanding also matters for policymakers and government agencies, as it allows them to develop evidence-based policies that promote road safety. This study aims to contribute to this critical area of research and to make a positive impact on road safety.

Objective

The ultimate goal of this study is to examine what historical crash data reveals about traffic accidents and to find ways to anticipate the factors that contribute to them, thereby helping to reduce the number of accidents and make our roads safer for everyone.

Predictive modeling uses mathematical algorithms and statistical methods to analyze historical data and make predictions about future events. In the context of traffic accidents, it can help identify high-risk areas and times, as well as factors such as road design, weather conditions, and driver behavior that contribute to accidents.

This type of work can provide valuable information for traffic management and accident prevention. Not only governments and corporations but also individuals can benefit from it. By being aware of high-risk regions and times, as well as the factors that lead to traffic accidents, people can take precautions to lower their personal risk. They can decide to avoid driving in specific regions during high-risk times, for instance, or enroll in a defensive driving course to drive more safely. Governments and businesses, in turn, can take steps to lower the likelihood of accidents, such as improving road design or launching public awareness programs that promote safe driving habits.

Related Work

Gutierrez-Osorio, C.; González, F.A.; Pedraza, C.A. Deep Learning Ensemble Model for the Prediction of Traffic Accidents Using Social Media Data. Computers 2022, 11, 126. https://doi.org/10.3390/computers11090126

This paper uses an ensemble deep learning model to predict road accident probability from spatiotemporal information drawn from social media and meteorological sources. The authors first clean the raw data and perform feature engineering to create suitable input for the deep learning models. The ensemble combines a GRU with two CNNs to extract patterns, relationships, and underlying connections in the road accident data. The results show that the proposed model performs better than the benchmark models and provides valuable insights for traffic control agencies planning road accident prevention activities. However, the study's limitations include the dependence of the predictions on the input data and the inability to incorporate important factors such as driver and pedestrian culture into the model. The authors plan to integrate additional data in future work to improve the model's predictions.

Santos, D.; Saias, J.; Quaresma, P.; Nogueira, V.B. Machine Learning Approaches to Traffic Accident Analysis and Prediction. Computers 2021, 10, 157. https://doi.org/10.3390/computers10120157

This paper explores the use of machine learning algorithms to analyze traffic accidents and predict hotspots. The authors collect traffic accident data and use various machine learning techniques, including decision trees, random forests, and support vector machines, to analyze the data and identify the factors that contribute to traffic accidents. The results show that these machine learning algorithms can effectively predict hotspots and provide valuable insights into the causes of traffic accidents. The paper highlights the potential of machine learning for traffic accident analysis and provides a valuable contribution to the field of traffic safety.

Gudemupati, S.S.R.; Chao, Y.L.; Kotikalapudi, L.P.; Ceesay, E. Prevent Car Accidents by Using AI. arXiv 2022, arXiv:2206.11381.

This paper addresses accident reduction by predicting accident severity. The study focuses on the state of Virginia and applies three algorithms, namely logistic regression, K-Nearest Neighbors (KNN), and Random Forest, to predict severity. The authors acknowledge that there is room for improvement in their methodology and suggest that future work should investigate more algorithms to verify the results and improve prediction accuracy. They also suggest investigating the relationship between the severity of car accidents and the state in which they occur, which could be done by computing relevant statistics and examining the link between the time of year and the severity of accidents. Furthermore, they raise the possibility of incorporating the model into a real-time accident risk prediction model to investigate the link between critical components and accident severity in greater depth. In conclusion, the paper highlights the importance of paying closer attention to the severity of automotive accidents and suggests future research in this area.

Below are some of the questions that this project focuses on answering.

Q&A?

The analysis showed an increasing trend in annual accident counts over recent years. Because accidents carry significant consequences in terms of property damage, physical injury, traffic congestion, emergency response time, and economic costs, this trend highlights the need for a comprehensive approach to improving road safety and for effective measures to reduce the risk of accidents.

An analysis of recorded accidents by weather condition found that the largest number of accidents occurred in clear weather. This suggests that adverse weather is not the only contributing factor to accidents. Raw counts do not account for how much driving takes place under each condition, so other factors such as traffic volume, driver behavior, and road conditions may also play a significant role. Further analysis is therefore needed to identify the root causes of accidents and develop appropriate measures to reduce them.

The analysis of data on fatal traffic accidents suggests a significant difference in driver involvement by gender, with male drivers more likely than female drivers to be involved in fatal crashes. Women thus appear to have a better safety record when it comes to driving, although further research is needed to understand the factors behind this difference.

The analysis found that Miami and Los Angeles have the highest number of traffic accidents among all cities in the US. California, the most populous state, has the highest number of recorded accidents, at over 700,000. Florida and Texas follow with the second- and third-highest accident counts, respectively.

To understand the underlying causes of accidents and develop solutions to reduce them, further analysis is needed to explore additional variables such as traffic volume, weather conditions, road infrastructure, and driver behavior. Improving road safety is crucial to reducing the economic and social costs associated with accidents, such as property damage, injuries, and traffic congestion. A comprehensive strategy that emphasizes safe driving practices is necessary to reduce the number of accidents and improve road safety in the US.

While the available data is not sufficient to draw a definitive conclusion, the data suggests that vehicle type can have an impact on the likelihood of accidents. Specifically, pickup trucks have been identified as a contributing factor to higher fatality rates in some states. The reasons for this are complex and multifaceted, and likely involve a combination of factors such as the design of the vehicle, driving conditions, driver behaviors, and demographics. Therefore, further research and analysis are needed to fully understand the relationship between vehicle type and accident rates.

The data show that more accidents occur on weekdays than on weekends, with Friday recording the highest number of any weekday. The peak time for accidents is between 3 PM and 6 PM. December records more accidents than any other month, which could be attributed to factors such as weather conditions, increased holiday travel, increased alcohol consumption, and reduced visibility due to shorter days. However, the analysis shows only a correlation between these time periods and accident counts, and other factors such as driver behavior and road conditions could also play a role.
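The weekday, hour, and month breakdown described above amounts to grouping crash timestamps and counting them. The following is a minimal sketch of that idea using only the Python standard library. The timestamps, the timestamp format, and the function name temporal_counts are hypothetical and chosen for illustration; they are not taken from the study's data or code.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical crash start times (illustrative only, not real records).
start_times = [
    "2021-12-03 16:45:00",  # a Friday
    "2021-12-03 17:10:00",  # a Friday
    "2021-12-04 09:30:00",  # a Saturday
    "2021-12-06 15:05:00",  # a Monday
    "2021-07-14 08:20:00",  # a Wednesday
]

def temporal_counts(timestamps):
    """Count crashes by weekday name, hour of day, and month name."""
    by_weekday, by_hour, by_month = Counter(), Counter(), Counter()
    for text in timestamps:
        ts = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        by_weekday[WEEKDAYS[ts.weekday()]] += 1  # weekday(): Monday is 0
        by_hour[ts.hour] += 1
        by_month[MONTHS[ts.month - 1]] += 1
    return by_weekday, by_hour, by_month

by_weekday, by_hour, by_month = temporal_counts(start_times)

# Share of crashes starting in the 3 PM to 6 PM window (hours 15, 16, 17).
peak_share = sum(by_hour[h] for h in (15, 16, 17)) / len(start_times)

print(by_weekday.most_common(1), by_month.most_common(1), peak_share)
```

Counts like these describe when crashes are recorded, not why they happen, which is consistent with the caveat about correlation above.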

An analysis of crash data and drunk driving rates across states suggests that alcohol is not the predominant factor behind higher traffic fatality rates. Although states with higher rates of drunk driving tend to have higher fatality rates, other factors such as road infrastructure, driver behavior, and enforcement of traffic laws may also affect fatality rates. It is therefore important to consider all of these factors when formulating strategies to reduce traffic fatalities.
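One way to quantify how strongly state-level drunk driving rates and fatality rates move together is the Pearson correlation coefficient. The sketch below computes it from scratch. The two lists of state-level values are hypothetical numbers invented for illustration, and the function name pearson_r is an assumption, not part of the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-state values (illustrative only, not real statistics).
drunk_driving_rate = [1.0, 2.0, 3.0, 4.0, 5.0]
fatality_rate = [9.0, 11.0, 10.0, 14.0, 13.0]

r = pearson_r(drunk_driving_rate, fatality_rate)
# r squared is the share of variance in one variable that a linear fit
# on the other accounts for; the remainder is left to other factors.
print(round(r, 3), round(r * r, 3))
```

A coefficient well below 1 would be consistent with the reading above that drunk driving is associated with fatality rates without fully accounting for them, although correlation by itself does not establish cause.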

The analysis involved identifying and labeling Interstate Highways and then checking for any significant difference in the severity and distance of accidents on Interstates compared to other roads. The results indicated very little difference in accident severity between the two road types. However, the mean distance recorded for accidents on Interstates was longer than for other road types. These findings suggest that although accidents on Interstates may not be significantly more severe, they may affect a longer stretch of road than accidents on other roads. It is important to keep in mind that these results are based on the available data and may not be representative of all accidents that occur on Interstates or other roads.
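The labeling and comparison step can be sketched as follows. The source does not state how Interstates were identified, so the regular expression on street names, the field names (street, severity, distance_mi), and the sample records are all assumptions made for illustration.

```python
import re
from statistics import mean

# Assumed rule: names like "I-95 N" or "Interstate 10 W" denote Interstates.
INTERSTATE_PATTERN = re.compile(r"\b(I-\d+|Interstate\s+\d+)\b", re.IGNORECASE)

def is_interstate(street):
    """Return True when the street name matches the assumed Interstate pattern."""
    return bool(INTERSTATE_PATTERN.search(street))

# Hypothetical accident records (illustrative only).
records = [
    {"street": "I-95 N", "severity": 2, "distance_mi": 1.2},
    {"street": "Interstate 10 W", "severity": 3, "distance_mi": 0.8},
    {"street": "Main St", "severity": 2, "distance_mi": 0.1},
    {"street": "Oak Ave", "severity": 3, "distance_mi": 0.3},
]

def compare_road_types(rows):
    """Summarize mean severity and mean distance for each road type."""
    groups = {"interstate": [], "other": []}
    for row in rows:
        label = "interstate" if is_interstate(row["street"]) else "other"
        groups[label].append(row)
    summary = {}
    for label, members in groups.items():
        if not members:
            continue  # skip empty groups rather than average nothing
        summary[label] = {
            "count": len(members),
            "mean_severity": mean(r["severity"] for r in members),
            "mean_distance_mi": mean(r["distance_mi"] for r in members),
        }
    return summary

summary = compare_road_types(records)
print(summary)
```

Whether a difference in group means is statistically significant would need a separate test on the real data; this sketch only computes the group averages being compared.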

Yes. Historical data can be used to build models that predict the severity of a traffic crash. In this study, machine learning classifiers such as decision trees, Naive Bayes, and Support Vector Machines showed promising results in predicting accident severity in the US Accidents dataset from features such as location, time, and weather conditions. These models identify patterns and relationships between the features and accident severity, which can support proactive measures such as optimizing traffic management strategies, improving road safety measures, and allocating appropriate resources for emergency response. Selecting an appropriate machine learning algorithm and fine-tuning its hyperparameters is essential to model performance.
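To make the classification idea concrete, here is a small categorical Naive Bayes classifier with Laplace smoothing, written from scratch. It is a sketch of the general technique, not the study's model: the class name, the three features (weather, time of day, road type), the two severity labels, and the training rows are all hypothetical.

```python
import math
from collections import Counter, defaultdict

class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[(label, feature_index)][value] -> number of occurrences
        self.value_counts = defaultdict(Counter)
        self.feature_values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.value_counts[(label, j)][value] += 1
                self.feature_values[j].add(value)
        return self

    def log_scores(self, row):
        """Unnormalized log posterior for each class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / total)  # log prior
            for j, value in enumerate(row):
                count = self.value_counts[(label, j)][value]
                n_values = len(self.feature_values[j])
                score += math.log((count + self.alpha)
                                  / (n_label + self.alpha * n_values))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)

# Hypothetical training data: (weather, time_of_day, road_type) -> severity.
train_rows = [
    ("clear", "day", "local"), ("clear", "day", "highway"),
    ("clear", "night", "local"), ("rain", "day", "local"),
    ("rain", "night", "highway"), ("snow", "night", "highway"),
    ("snow", "day", "highway"), ("rain", "night", "local"),
]
train_labels = ["low", "low", "low", "low", "high", "high", "high", "high"]

model = CategoricalNaiveBayes().fit(train_rows, train_labels)
print(model.predict(("snow", "night", "highway")))
print(model.predict(("clear", "day", "local")))
```

The smoothing parameter alpha is an example of a hyperparameter that would be tuned on held-out data; evaluation on a separate test set is omitted here for brevity.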

Yes, association rule mining can be used to identify patterns and relationships between different factors that may contribute to the likelihood of a crash. By preprocessing a traffic crash dataset and applying data mining techniques such as the Apriori algorithm, frequent itemsets and association rules can be discovered and interpreted to identify the common causes of traffic crashes and their relationships. Based on these insights, strategies can be developed to tackle the common causes of traffic crashes and minimize their incidence, such as enhancing public awareness campaigns or improving road infrastructure. Association rule mining can thus provide valuable insights into the relationships between crash factors and help inform efforts to reduce the frequency and severity of crashes.
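The Apriori procedure mentioned above can be sketched in a few lines: find frequent itemsets level by level, then derive rules whose confidence clears a threshold. The implementation below is a compact illustration, not the study's code. The crash records, the attribute names, and the thresholds are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, supp in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent,
                                  supp, confidence))
    return rules

# Hypothetical crash records encoded as sets of attributes.
crashes = [
    {"rain", "night", "speeding"},
    {"rain", "night"},
    {"rain", "night", "speeding"},
    {"clear", "day"},
    {"clear", "night", "speeding"},
    {"rain", "day"},
]

frequent = apriori(crashes, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.8)
for antecedent, consequent, supp, conf in rules:
    print(set(antecedent), "->", set(consequent), supp, conf)
```

On real data the support and confidence thresholds would need tuning, and a rule such as the one found here describes co-occurrence in the records rather than a causal relationship.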

Yes, it is possible to use advanced techniques such as a Convolutional Neural Network (CNN) to identify accidents from traffic camera footage in real time. By training the CNN on a large and varied dataset of accident and non-accident images extracted from real-time CCTV footage, the model can learn to distinguish various types of accidents based on visual characteristics present in the images. The high accuracy achieved by the CNN model on the test and validation sets indicates its effectiveness in identifying accidents. Deploying such a model on surveillance cameras could improve emergency response times and save lives. However, challenges remain, such as the accuracy and reliability of the model in real-world scenarios and the ethical and legal considerations surrounding the use of surveillance cameras and AI models in public spaces. Further research is needed to explore the potential of this approach.
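For readers unfamiliar with how a CNN extracts visual characteristics, the sketch below shows the forward pass of a single convolutional stage: convolution, ReLU activation, and max pooling. It is only the building block, not a trained accident detector and not the study's architecture. The 5x5 "image" and the hand-set edge kernel are hypothetical; in a real CNN the kernel weights are learned from labeled frames.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core op of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu(feature_map):
    """Element-wise ReLU: negative responses are clipped to zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling (drops a trailing odd row or column)."""
    return [[max(feature_map[i][j], feature_map[i][j + 1],
                 feature_map[i + 1][j], feature_map[i + 1][j + 1])
             for j in range(0, len(feature_map[0]) - 1, 2)]
            for i in range(0, len(feature_map) - 1, 2)]

# Hypothetical 5x5 grayscale patch: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-set kernel that responds to dark-to-bright vertical edges.
# In a trained CNN these weights would be learned, not chosen by hand.
edge_kernel = [[-1, 1],
               [-1, 1]]

feature_map = relu(conv2d(image, edge_kernel))
pooled = max_pool_2x2(feature_map)
print(pooled)
```

A full classifier would stack several such stages, add a dense output layer, and learn all weights by gradient descent, typically with a deep learning framework rather than plain Python.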

A targeted approach focused on specific states or cities can indeed enhance road safety. Employing machine learning techniques such as clustering can help in identifying clusters of US states or cities based on factors like fatality trends or primary causes of accidents. Such clusters can provide valuable insights into the factors contributing to these patterns, which can guide policymakers and stakeholders in developing effective strategies to improve road safety and reduce fatalities. Clustering can also help identify exemplary groups to serve as a model for others. However, it's crucial to keep in mind that other factors such as road infrastructure, vehicle safety standards, and driver behavior can also influence fatality rates and should be considered in a comprehensive road safety analysis.
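Clustering states by fatality-related features can be sketched with k-means. The source does not say which clustering algorithm or features were used, so the choice of k-means, the two features, the state names, and every number below are assumptions for illustration.

```python
import math

def kmeans(points, initial_centroids, max_iterations=20):
    """Lloyd's k-means with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)),
                key=lambda k: math.dist(point, centroids[k]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    return assignments, centroids

# Hypothetical features per state:
# (fatalities per 100k residents, share of alcohol-related fatal crashes).
state_features = {
    "State A": (8.0, 0.25),
    "State B": (9.0, 0.27),
    "State C": (18.0, 0.35),
    "State D": (20.0, 0.33),
}

names = list(state_features)
points = [state_features[name] for name in names]
assignments, centroids = kmeans(points, [points[0], points[2]])
clusters = dict(zip(names, assignments))
print(clusters)
```

Because the two features are on different scales, a real analysis would usually standardize them before clustering and would try several starting centroids and values of k.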