Harnessing GIS and Machine Learning for Accurate Traffic Flow Prediction in Tehran, Iran

Traffic is a major concern in urban areas, driven largely by migration from rural areas, which brings larger populations and more vehicles. Worldwide, congestion has become a critical challenge for cities: it lengthens travel times, increases fuel use, and worsens environmental pollution. Authorities are responding in several ways, including intelligent transportation systems that rely on traffic prediction for optimization. Reliable traffic flow forecasts are essential for managing traffic, controlling signals, and planning routes, and they let drivers make well-informed choices and switch to alternative routes when needed.

Conventional traffic management approaches typically require costly and lengthy infrastructural changes. A more effective solution involves data-driven traffic flow predictions. One particularly promising method integrates Geographic Information System (GIS) technology with machine learning to improve forecasting accuracy. With precise traffic forecasts, urban planning and real-time decision-making become more effective for commuters, traffic managers, and city planners.

The ability to accurately predict congestion hotspots can substantially alleviate various traffic-related challenges. For example, optimizing route selection can lead to a significant reduction in air pollution by lessening vehicle emissions. Moreover, effective route planning not only saves time but also minimizes delays, fostering a smoother traffic flow. Analyzing congestion patterns is essential for accident prevention, as it helps mitigate risks associated with congested roadways. Additionally, reducing idling time contributes to fuel conservation, ultimately enhancing overall fuel efficiency. This comprehensive approach can result in a more sustainable and safer driving experience.

Geographic Information Systems (GIS) play a crucial role in traffic prediction by integrating spatial and temporal data, which greatly enhances the reliability of forecasting models. This blog post explores the research conducted by Mehdi Babaei and Saeed Behzadi from Shahid Rajaee Teacher Training University in Tehran, Iran, focusing on the prediction of traffic flow utilizing machine learning techniques in conjunction with GIS.

The study analyzed traffic patterns in Tehran, Iran, using a Web GIS system that updated traffic conditions every 15 minutes. The dataset included geospatial coordinates (latitude and longitude), temporal information (day of the week and holidays), weather conditions (three levels), and traffic congestion levels (five categories, from no traffic to very high traffic). With more than 2 million records to analyze, data preprocessing was instrumental in improving prediction accuracy. It comprised noise reduction, data normalization, and the handling of missing values, so that the dataset was reliable enough for analysis.
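The post does not say which normalization or missing-value strategy the authors used. As a minimal sketch, assuming min-max scaling of the coordinates and simple removal of incomplete records, two of the preprocessing steps might look like this in Python. The record layout and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical record layout (an assumption, not the study's schema):
# (latitude, longitude, weather_level, traffic_level); None marks a missing value.


def drop_incomplete(records):
    """Keep only the records in which every field is present."""
    return [r for r in records if all(v is not None for v in r)]


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def preprocess(records):
    """Drop incomplete records, then normalize the two coordinate columns."""
    clean = drop_incomplete(records)
    lats = min_max_scale([r[0] for r in clean])
    lons = min_max_scale([r[1] for r in clean])
    return [(lat, lon, r[2], r[3]) for lat, lon, r in zip(lats, lons, clean)]
```

Scaling matters most for distance-based models such as KNN, where a feature with a large numeric range would otherwise dominate the distance.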

The study assessed five machine learning algorithms for traffic level classification: Decision Tree (DT), K-Nearest Neighbors (KNN), discriminant analysis, Naïve Bayes, and artificial neural networks (ANN). A Decision Tree uses a hierarchical structure made up of a root node, internal nodes that pose queries, branches that represent possible outcomes, and leaf nodes that yield predictions. KNN is a straightforward, nonparametric method widely used in data mining, machine learning, and pattern recognition. It classifies a data point by examining its nearest neighbours and is often used in forecasting. Discriminant analysis is a multivariate statistical approach that separates groups of observations based on multiple variables and evaluates each variable’s contribution to that separation. The Naïve Bayes classifier applies Bayes’ theorem to estimate the probability of each class from the current data, under the assumption that the variables are independent. ANNs consist of artificial neurons (nodes) connected by synapses (edges), are well suited to pattern recognition and classification, and are divided into feed-forward and feedback networks. The study uses a two-layer feed-forward network with sigmoid hidden neurons. The hyperparameter configurations for the algorithms are as follows:

For DT,

Max Number of Splits: 300; Split Criterion: Gini’s Diversity

For KNN,

K = 1; Distance Metric: City Block; Distance Weight: Inverse

For Discriminant Analysis,

Discriminant type: Quadratic

For Naive Bayes,

Kernel type: Gaussian; Support: Unbounded

For ANN,

Number of Hidden Nodes = 30, 45, 55; Epochs = 1000; Training function: scaled conjugate gradient backpropagation
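To make the KNN settings above concrete, here is a minimal Python sketch of a nearest-neighbour classifier that uses the city block (Manhattan) distance and inverse-distance weighting. It is an illustration under stated assumptions, not the authors' implementation: the function names and the toy two-feature vectors are hypothetical. Note that with K = 1 the weighting has no effect, because a single neighbour casts the only vote.

```python
from collections import defaultdict


def city_block(a, b):
    """City block (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=1):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbours."""
    # Sort training points by distance to the query and keep the k closest.
    nearest = sorted(
        (city_block(x, query), label) for x, label in zip(train_x, train_y)
    )[:k]
    votes = defaultdict(float)
    for dist, label in nearest:
        if dist == 0:
            return label  # an exact match decides the class outright
        votes[label] += 1.0 / dist  # closer neighbours count for more
    return max(votes, key=votes.get)
```

For example, with training points `[(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]` labelled `[1, 2, 5]`, the query `(0.9, 1.2)` is closest to the second point and is assigned traffic level 2.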

Before the analysis, the dataset was split into 70% for training and 30% for testing. Model performance was assessed using both accuracy and the kappa statistic. The results were as follows: Decision Tree (DT), 54.73% accuracy with a kappa of 0.4; K-Nearest Neighbors (KNN), 96.14% accuracy with a kappa of 0.95; Discriminant Analysis, 41.57% accuracy with a kappa of 0.27; Naïve Bayes, 52.14% accuracy with a kappa of 0.38; and Artificial Neural Networks (ANN), 57.76% accuracy with a kappa of 0.45.
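For readers unfamiliar with the kappa statistic, the sketch below computes accuracy and kappa from true and predicted labels. It assumes the reported metric is Cohen's kappa, which compares the observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(predicted = c).
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one class occurs")
    return (p_o - p_e) / (1.0 - p_e)
```

Kappa is useful alongside accuracy because a model that always predicts the most common traffic level can reach a respectable accuracy while scoring a kappa near zero.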

These findings show that KNN outperformed the other models by a wide margin. Its effectiveness can be attributed to its ability to take neighbouring traffic conditions into account, which makes it particularly well suited to predictions that rely on spatial data.

A key aspect of the study was creating a GIS-based traffic prediction map. Combining GIS with the machine learning models produced a detailed forecast map showing congestion levels on the city’s streets. The map is colour-coded by traffic density so that conditions can be read at a glance. Green indicates no traffic and smooth driving. Yellow shows low traffic: some vehicles are present, but the flow is largely good. Orange denotes medium traffic, where delays may begin but remain manageable. Red marks high traffic, with possible slowdowns and a need for caution. Dark red signifies very high traffic, which often means considerable congestion and extended delays. This scheme lets drivers quickly evaluate the situation and adjust their routes as needed. The maps give commuters real-time insight, helping them choose less congested routes and plan trips effectively.
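The colour scheme described above amounts to a lookup from predicted traffic level to map colour. A minimal sketch follows, assuming the five classes are numbered 1 (no traffic) to 5 (very high traffic). That numbering and the names used here are assumptions made for illustration.

```python
# Assumption: levels are numbered 1 (no traffic) to 5 (very high traffic).
TRAFFIC_COLOURS = {
    1: ("no traffic", "green"),
    2: ("low traffic", "yellow"),
    3: ("medium traffic", "orange"),
    4: ("high traffic", "red"),
    5: ("very high traffic", "dark red"),
}


def colour_for(level):
    """Return the map colour for a predicted traffic level."""
    if level not in TRAFFIC_COLOURS:
        raise ValueError(f"unknown traffic level: {level!r}")
    return TRAFFIC_COLOURS[level][1]
```

A map layer could then style each street segment with `colour_for(predicted_level)`.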

Integrating GIS with machine learning presents several significant benefits. To start, it offers a cost-effective strategy by relying on data-driven insights instead of necessitating major infrastructural changes. This approach also supports real-time processing, allowing dynamic updates to traffic maps for effective monitoring. Furthermore, its scalability enables application across various cities with minimal adjustments, making it a versatile option. It also yields substantial environmental advantages by decreasing fuel consumption and air pollution through improved traffic flow optimization. Lastly, it enhances city planners’ decision-making by providing accurate congestion forecasts that can inform proactive traffic management strategies.

The combination of GIS and machine learning, particularly KNN, has proven to be a highly effective method for predicting traffic. The study highlights how models driven by spatial data can transform urban transportation planning. With predictive traffic maps, municipalities can put better congestion control measures in place, and commuters can make well-informed travel choices. As urban areas continue to expand, adopting smart traffic management solutions will be crucial for creating sustainable cities. The future of transportation hinges on using big data, machine learning, and GIS to develop smarter and more efficient mobility solutions.

References

Babaei, M., & Behzadi, S. (2023). Spatial Data-Driven Traffic Flow Prediction Using Geographical Information System. Journal of Soft Computing in Civil Engineering, 7(4), 132–143. https://doi.org/10.22115/scce.2023.346188.1460

Jafarian H, Behzadi S. Evaluation of PM2.5 Emissions in Tehran by Means of Remote Sensing and Regression Models. Pollution 2020;6:521–9. https://doi.org/10.22059/poll.2020.292065.706

Zuo W, Zhang D, Wang K. On kernel difference-weighted k-nearest neighbor classification. Pattern Anal Appl 2008;11:247–57. https://doi.org/10.1007/s10044-007-0100-z

Akbari M, Overloop PJ van, Afshar A. Clustered K Nearest Neighbor Algorithm for Daily Inflow Forecasting. Water Resour Manag 2011;25:1341–57. https://doi.org/10.1007/s11269-010-9748-z

Hlaing SS, Tin MM, Khin MM, Wai PP, Sinha GR. Big Traffic Data Analytics For Smart Urban Intelligent Traffic System Using Machine Learning Techniques. 2020 IEEE 9th Glob. Conf. Consum. Electron., IEEE; 2020, p. 299–300. https://doi.org/10.1109/GCCE50665.2020.9291790

Ghasempoor Z, Behzadi S. Predicting Traffic Data in GIS using Different Neural Network Methods. Int J Geogr Geol 2022;11:62–71. https://doi.org/10.18488/10.v11i2.3166