Machine learning offers a robust, data-driven methodology for identifying relationships within datasets, thereby improving the accuracy of estimates. Machine learning models can reveal non-linear relationships that traditional methods may overlook: by learning from the data, they identify patterns that are not readily apparent through conventional analytical techniques. Their strength lies in the ability to generalize from the data they have processed, which enables them to make accurate predictions for new, unseen data, for example when determining the permeability coefficient of various soil types.
In addition to its applications in geotechnical engineering, machine learning is increasingly used across other fields of civil engineering. One of its key advantages is the capacity to adapt to evolving scenarios, since models can be retrained as new data become available. Furthermore, machine learning techniques complement and enhance traditional engineering knowledge rather than replace it.
A notable study by Justyna Dzięcioł at the Institute of Civil Engineering, Warsaw University of Life Sciences (SGGW), investigated the prediction of the coefficient of permeability using four methods: artificial neural networks (ANN), random forest (RF), gradient boosting (GB), and linear regression (LR). The input parameters included volumetric density, porosity, porosity index, grain size curvature, and the homogeneity index. The dataset comprised 261 samples, which were divided into 70% for training and 30% for testing. To check the robustness of the models, ten-fold cross-validation, a resampling technique for evaluating machine learning models on limited data, was used for validation.
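The data partitioning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual procedure: the function names, the random seed, and the choice to run the ten folds on the training portion (rather than on all 261 samples) are hypothetical, since the source does not specify them.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    # Deal the indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        # Every fold except the i-th one is used for fitting.
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


# 261 samples, as in the study; 70% training and 30% testing.
train_idx, test_idx = train_test_split_indices(261)

# Assumption: the ten folds are drawn from the training portion only.
folds = list(k_fold_indices(train_idx, k=10))

print(len(train_idx), len(test_idx), len(folds))
```

With 261 samples, a 30% test share rounds to 78 samples, leaving 183 for training. Each of the ten folds then serves once as the validation set while the other nine are used for fitting.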
The predictive models were compared using the R-squared statistic, which indicates the proportion of variance in the dependent variable that is predictable from the independent variables. The linear regression model achieved an R-squared value of 0.801, meaning that approximately 80.1% of the variability in the outcome is explained by the model. The artificial neural network (ANN) reached a marginally lower R-squared value of 0.8, explaining about 80% of the variance.
In contrast, the random forest and gradient boosting models achieved notably higher R-squared values. The random forest model scored 0.993, accounting for 99.3% of the variability. The gradient boosting model performed slightly better, with an R-squared value of 0.995, explaining 99.5% of the variance in the data.
These results indicate that the gradient boosting model outperformed linear regression, ANN, and random forest in this analysis, although its margin over random forest was small (0.995 versus 0.993).
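The R-squared statistic used in this comparison can be computed directly from observed and predicted values as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that standard definition; the function name and the sample numbers are made up for illustration and are not data from the study.

```python
def r_squared(observed, predicted):
    """Return R^2 = 1 - SS_res / SS_tot for paired observations and predictions."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variation of the observations around their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values (not taken from the study).
measured = [1.0, 2.0, 3.0, 4.0]
estimated = [1.1, 1.9, 3.2, 3.8]
score = r_squared(measured, estimated)
print(round(score, 3))
```

A value close to 1 means the predictions track the measurements closely, and a value of 0 means the model does no better than always predicting the mean of the observations.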
Machine learning techniques have great potential for predicting the permeability coefficient in civil engineering. By drawing on large datasets and complex algorithms, these methods can offer more accurate predictions than traditional empirical formulas. Predicting this coefficient is essential for groundwater flow analysis, soil stabilization, and other geotechnical applications. However, further research is needed to validate the reliability of these models in real-world scenarios. The choice of algorithm depends on data characteristics, performance expectations, and available resources. For the analyzed soils, the gradient boosting algorithm showed the highest predictive efficiency, with an R-squared value of 0.995 against the laboratory results. To generalize these results, broader datasets and a wider variety of materials should be examined.
Reference
Dzięcioł, J. (2024). Machine Learning in Civil Engineering – on the example of prediction of the coefficient of permeability. Acta Scientiarum Polonorum. Architectura, 22, 184–191. https://doi.org/10.22630/ASPA.2023.22.18