A team of researchers from Gebze Technical University in Gebze-Kocaeli, Turkey, studied landslide susceptibility mapping in the Yenice district of Karabük. The researchers compared four methods: Canonical Correlation Forest (CCF), Random Forest (RF), Rotation Forest (RotFor), and Logistic Regression (LR). The study area, which spans 678 km², is located along the western Black Sea coast of Turkey and is frequently affected by landslides.
This region is characterized by highly mountainous terrain with steep hillside slopes and flat valleys. Its physiographic, land, and climatic characteristics, including above-average rainfall and specific soil properties, contribute to its vulnerability to landslide activity. The mean annual precipitation in the area is recorded at 1,200 millimeters.
Landslides are among the most concerning and frequently occurring natural disasters, particularly in hilly and mountainous regions. These events can result in significant loss of life, property damage, and disruption of infrastructure, making it imperative to effectively manage and mitigate their impacts. One of the key strategies for addressing landslide risk is the gathering of accurate data on the proportion and spatial distribution of areas that are susceptible to such events.
Creating comprehensive landslide susceptibility maps is essential for this purpose. These maps categorize different regions based on their likelihood of experiencing a landslide, employing various criteria such as geological composition, slope steepness, vegetation cover, and rainfall patterns. By dividing the landscape into zones reflecting varying levels of severity, these maps provide critical information that aids in informed decision-making.
With the insights gained from landslide susceptibility mapping, planners and local authorities are better equipped to implement effective land use strategies. This proactive approach allows for the development of policies that minimize risks, such as restricting construction in high-risk zones, enhancing drainage systems, and establishing early warning systems. Ultimately, a thorough understanding of landslide susceptibility not only facilitates disaster preparedness and response but also promotes sustainable development and hazard mitigation practices in vulnerable areas.
Numerous studies have explored heuristic, statistical, and computational methods for landslide susceptibility mapping. Among these, computational techniques have demonstrated superior generalization capabilities, particularly for non-linear problems. As a result, the application of machine learning techniques in landslide analysis has grown significantly. Ensemble techniques in particular have surged in popularity because they tend to yield better results than standalone models.
Accordingly, the researchers applied ensemble models to a landslide inventory of 159 polygons comprising 27,294 landslide pixels, with each pixel representing a 30 × 30 m area on the ground. The landslide conditioning factors included in the analysis were lithology, land use and land cover (LULC), topographic wetness index (TWI), elevation, normalized difference vegetation index (NDVI), aspect, drainage density, and slope. The dataset was split into 70% for training and 30% for testing.
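A 70/30 split of this kind can be sketched in a few lines of Python. This is only an illustration under simple assumptions (a plain random shuffle with a fixed seed); the function name `split_samples` and the toy sample IDs are hypothetical, and the study's actual sampling procedure may differ.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=42):
    """Randomly split samples into training and testing subsets."""
    rng = random.Random(seed)        # fixed seed keeps the split reproducible
    shuffled = list(samples)         # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical pixel IDs standing in for landslide and non-landslide samples.
pixel_ids = list(range(10))
train_ids, test_ids = split_samples(pixel_ids)
print(len(train_ids), len(test_ids))
```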
To evaluate the density of landslide versus non-landslide areas, the frequency ratio method was applied. For every class of a conditioning factor, the density was calculated as the area covered by landslide (or non-landslide) pixels in that class divided by the total area of the class. This process was repeated for each factor. The resulting ratios were then summed, and each ratio was divided by that sum to obtain the overall landslide density.
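The per-class density and its normalization, as described in the text, can be illustrated with a minimal Python sketch. The function name `class_densities`, the slope class labels, and the pixel data are all hypothetical, and the code follows the description given here rather than the authors' implementation.

```python
from collections import Counter


def class_densities(factor_classes, is_landslide):
    """Return per-class landslide density and its normalized share.

    factor_classes: class label of each pixel for one conditioning factor.
    is_landslide:   1 if the pixel is a landslide pixel, otherwise 0.
    """
    total = Counter(factor_classes)
    slides = Counter(c for c, s in zip(factor_classes, is_landslide) if s)
    # Density: landslide pixels in the class divided by all pixels in the class.
    density = {c: slides[c] / total[c] for c in total}
    # Normalization: divide each ratio by the sum of all ratios.
    ratio_sum = sum(density.values())
    normalized = {c: (d / ratio_sum if ratio_sum else 0.0)
                  for c, d in density.items()}
    return density, normalized


# Hypothetical slope classes for ten equally sized pixels.
slope = ["low"] * 4 + ["mid"] * 4 + ["high"] * 2
landslide = [0, 0, 0, 1,  0, 1, 1, 0,  1, 1]
density, normalized = class_densities(slope, landslide)
print(density)
print(normalized)
```

Because all pixels share the same ground area, pixel counts stand in for areas in this sketch.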
For the analysis, the TreeBagger function was used to develop the Random Forest (RF) and Rotation Forest (RotFor) models, while open-source code was used for the Canonical Correlation Forest (CCF), all within the MATLAB environment. The area under the curve (AUC) values for the LR, RF, RotFor, and CCF models were 0.826, 0.982, 0.966, and 0.970, respectively. By this measure, the RF model outperformed the other models.
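For readers unfamiliar with AUC, the sketch below computes it in plain Python as the probability that a randomly chosen landslide pixel receives a higher score than a randomly chosen non-landslide pixel, counting ties as one half. The function name `auc_score` and the labels and scores are hypothetical; this is not the study's MATLAB code.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # landslide pixel ranked above non-landslide
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical test labels (1 = landslide) and predicted susceptibility scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auc_score(labels, scores)
print(f"{auc:.3f}")
```

The pairwise form is quadratic in the number of samples, which is fine for illustration but too slow for tens of thousands of pixels; a rank-based formulation gives the same value more efficiently.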
Success rate curves were also calculated for the RF, CCF, RotFor, and LR susceptibility models. The 10% of the area ranked most susceptible accounted for approximately 53%, 48%, 52%, and 42% of existing landslides, respectively. Extending the analysis to the most susceptible 30% of the area captured 82%, 81%, 79%, and 75% of existing landslides, respectively. For the RF model, in other words, 82% of identified landslides fell within the 70-100% susceptibility range. Overall, all models performed well, with the ensemble-based algorithms consistently surpassing the LR model.
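One point on a success rate curve can be computed as sketched below: rank pixels from most to least susceptible, take the top fraction of the area, and count the share of known landslide pixels that fall inside it. The function name `success_rate` and the ten-pixel example are hypothetical, and equal pixel areas are assumed.

```python
def success_rate(susceptibility, is_landslide, area_fraction):
    """Share of landslide pixels inside the most susceptible area_fraction."""
    # Pixel indices ordered from highest to lowest susceptibility.
    order = sorted(range(len(susceptibility)),
                   key=lambda i: susceptibility[i], reverse=True)
    k = int(round(area_fraction * len(order)))   # pixels in the top fraction
    total = sum(is_landslide)
    if total == 0:
        raise ValueError("no landslide pixels in the inventory")
    captured = sum(is_landslide[i] for i in order[:k])
    return captured / total


# Hypothetical susceptibility scores and landslide flags for ten pixels.
susceptibility = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
landslide_flags = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
top10 = success_rate(susceptibility, landslide_flags, 0.10)
top30 = success_rate(susceptibility, landslide_flags, 0.30)
print(top10, top30)
```

Evaluating the function over a range of area fractions from 0 to 1 traces the full curve.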
In addition to these evaluations, Wilcoxon’s signed-rank test, a non-parametric statistical test, was conducted to assess whether the differences in model performance were statistically significant. The results indicated statistically significant differences in performance between the logistic regression (LR) model and the ensemble models (RF, CCF, and RotFor).
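The mechanics of the signed-rank test can be sketched in plain Python. This simplified version uses the normal approximation for the two-sided p-value, without tie or continuity corrections, so it is an illustration rather than a substitute for a statistics library. The function name `wilcoxon_signed_rank` and the paired scores are hypothetical and are not results from the study.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Wilcoxon signed-rank statistic and approximate two-sided p-value."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1        # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    stat = min(w_plus, w_minus)
    # Normal approximation (assumption: adequate only for larger samples).
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (stat - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return stat, p_value


# Hypothetical paired scores for two models on the same eight data subsets.
model_a = [0.82, 0.79, 0.85, 0.80, 0.83, 0.78, 0.81, 0.84]
model_b = [0.97, 0.95, 0.98, 0.96, 0.99, 0.94, 0.97, 0.98]
stat, p_value = wilcoxon_signed_rank(model_a, model_b)
print(stat, p_value)
```

A small p-value (conventionally below 0.05) suggests that the paired differences are unlikely to be centered on zero.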
In recent years, machine learning and data mining algorithms have become popular for complex modelling problems such as landslide susceptibility mapping (LSM), owing to their high accuracy and low computational cost. Because many advanced algorithms are not available in GIS environments such as ArcGIS, researchers often implement them in statistical software such as MATLAB. In this workflow, GIS tools prepare the input data, while the statistical software handles modelling and prediction.
Reference
Sahin, E. K., Colkesen, I., & Kavzoglu, T. (2020). A comparative assessment of canonical correlation forest, random forest, rotation forest and logistic regression methods for landslide susceptibility mapping. Geocarto International, 35(4), 341-363.