If you regularly take the train, you probably don’t give much thought to dirt. You might find yourself glancing out at the passing countryside, scrolling through your phone, or enjoying your coffee. Chances are, you’re not pondering the fact that you’re traveling across a landscape supported by millions of tons of soil, gravel, and clay.
But if you are a geotechnical engineer overseeing a high-speed rail corridor stretching from Moscow to Kazan, dirt is all you think about. Specifically, you are thinking about what happens to that dirt when the temperature drops to -5°C, the groundwater rises, and the ground decides it wants to lift the tracks three inches higher than they were yesterday. This phenomenon is called frost heave, and it is one of the most expensive, annoying, and dangerous geological hazards in the world. In the United States alone, damage related to expansive (heaving) soils costs billions of dollars annually.
For decades, the only way to know if a soil was going to heave was to dig it up, bring it back to a lab, freeze it, and measure how much it moved. This process takes weeks, costs tens of thousands of dollars per season, and often requires a team of engineers with clipboards and sieves. But a groundbreaking new study published in 2025 changes the game entirely. Researchers have successfully developed an AI-driven spectral analysis system that can predict soil heaving potential in seconds. It doesn’t dig. It doesn’t wait. It listens.
Let’s break down how this works, why it matters, and what it means for the future of infrastructure.
The Silent Battle Beneath the Train Tracks
Consider a standard railway embankment, where beneath the steel rails and wooden ties rests a meticulously compacted layer of soil. In an ideal scenario, this soil remains completely stable. However, in reality, particularly in temperate continental climates such as central Russia, the onset of winter can create significant challenges.
As temperatures drop, moisture contained within the soil begins to freeze. It is important to note that water expands by approximately 9% when it transitions into ice. This process of expansion exerts pressure on soil particles, leading to a phenomenon known as frost heave, which can have critical implications for railway infrastructure.
Soils exhibit varying behaviors based on their composition. Sand, for instance, has a rapid drainage capacity, allowing water to pass through it before it has the opportunity to freeze. In contrast, clay operates similarly to a sponge. Its fine particles retain water, holding it securely against gravitational forces. When this moisture freezes, the clay expands significantly, exerting considerable pressure that can lead to structural damage, such as cracking in concrete foundations and warping of railway tracks.
The study focuses on the prospective Moscow-Kazan railway corridor, a region characterized by seasonal freezing and a variety of soil types. The researchers selected two distinct sites:
- Site 1 (Moscow region): Primarily sands and sandy loams with low heaving potential.
- Site 2 (Republic of Tatarstan): Clays and loams with high water retention and significant frost susceptibility.
Historically, the process of assessing soil conditions at construction sites involved several time-consuming steps. These included drilling boreholes, extracting intact soil samples known as “monoliths,” encasing them in wax, and transporting them to a laboratory for analysis. The analysis typically involved subjecting the samples to freeze-thaw cycles, a process that could take upwards of a month.
The Challenge: Unfortunately, by the time the lab results were received, construction crews had often already progressed to the next section of the project.
The Approach: To address this inefficiency, the researchers propose a shift in strategy: rather than waiting for the soil to undergo freeze-thaw cycles, use acoustic measurements to assess soil conditions directly and in real time.
The Physics of “Listening” to Dirt
At first glance, the idea of using sound to analyze soil seems almost mystical. But the physics is surprisingly intuitive.
Think of tapping a wine glass. A fine crystal glass rings with a clear, high-pitched tone. A cracked glass thuds. A plastic cup barely makes a sound. The material’s internal structure (its density, its cracks, its moisture) directly dictates the frequency and speed of the sound wave passing through it.
Soil is the same way.
The researchers employed a technique called Non-Destructive Spectral Analysis (NDSA). They placed soil samples in cylindrical plastic flasks and attached piezoelectric sensors to the sides and top. These sensors generate high-frequency ultrasonic pulses (ranging from 20 kHz to 120 kHz) that travel through the soil.
Here is where it gets technical, but stay with me:
- Longitudinal waves (P-waves) compress and expand the soil like a slinky. They travel fast and tell us about the soil’s bulk density and elasticity.
- Transverse waves (S-waves) shake the soil side-to-side. They travel slower and tell us about the soil’s shear strength—how well the particles hold together.
By measuring the delay between when the pulse is sent and when it is received (a matter of microseconds), the system calculates the velocity of these waves.
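The time-of-flight calculation described above can be sketched in a few lines. This is only an illustration: the path length, delays, and density below are made-up values, and the function names are hypothetical rather than taken from the study. The modulus step uses the standard relation modulus = density × velocity², which links the P-wave to stiffness in compression and the S-wave to shear stiffness.

```python
def wave_velocity(path_length_m: float, travel_time_s: float) -> float:
    """Velocity = distance travelled / time of flight."""
    if travel_time_s <= 0:
        raise ValueError("travel time must be positive")
    return path_length_m / travel_time_s


def modulus_pa(density_kg_m3: float, velocity_m_s: float) -> float:
    """Elastic modulus (Pa) from density and wave speed: rho * v**2."""
    return density_kg_m3 * velocity_m_s ** 2


# Hypothetical readings (assumptions, not data from the study):
path = 0.10        # sensor-to-sensor distance through the sample, metres
p_delay = 50e-6    # P-wave delay, seconds
s_delay = 125e-6   # S-wave delay, seconds
density = 1800.0   # bulk density, kg/m^3

vp = wave_velocity(path, p_delay)          # about 2000 m/s
vs = wave_velocity(path, s_delay)          # about 800 m/s
p_wave_modulus = modulus_pa(density, vp)   # stiffness in compression
shear_modulus = modulus_pa(density, vs)    # stiffness in shear

print(f"Vp = {vp:.0f} m/s, Vs = {vs:.0f} m/s")
print(f"P-wave modulus = {p_wave_modulus:.3e} Pa, shear modulus = {shear_modulus:.3e} Pa")
```

Because the S-wave arrives later over the same path, its velocity is lower, which matches the description of transverse waves above.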
The Golden Rule:
- High velocity = Dense, stiff, frozen soil (low heave risk).
- Low velocity = Loose, soft, thawed soil (potential heave risk).
Velocity alone is insufficient to fully understand the complex interactions between sound waves and soil characteristics. Therefore, the researchers recognized the need for a crucial additional parameter: the relative acoustic compressibility coefficient, denoted as (β).
To conceptualize β, imagine it as an indicator of how much the soil deforms, or “squishes,” in response to acoustic pressure. Different types of soil exhibit varying degrees of compressibility; for instance, sandy soils tend to have low compressibility, meaning they deform little when subjected to sound waves. In contrast, clay soils demonstrate high compressibility, reflecting a greater capacity for deformation due to their intricate particle structure and higher water retention.
To achieve their goal, the researchers employed advanced signal analysis techniques, particularly the Fast Fourier Transform (FFT). This method allowed them to dissect the frequency spectrum of the acoustic signals that were returned after interacting with the soil. Through this analysis, they identified distinct “acoustic windows” that correspond to various soil types, revealing unique resonant frequencies indicative of each type’s physical properties.
For sandy soils, the FFT analysis revealed multiple resonant peaks at frequencies of 25 kHz, 42 kHz, and 72 kHz. The broad nature of these peaks suggests that sandy soils have high energy dissipation characteristics, meaning sound waves can travel through them with relative ease, escaping efficiently and resulting in a less pronounced acoustic response.
Conversely, the analysis showed that clay soils exhibit sharply defined resonant peaks located in lower frequency ranges. This phenomenon indicates that clay soils possess a tighter bonding between particles, which contributes to their higher compressibility and enhanced water retention capabilities. Overall, the differences in acoustic signatures provide valuable insights into the mechanical behavior and physical composition of various soil types, allowing for a deeper understanding of their interaction with sound energy.
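To make the FFT step concrete, here is a minimal sketch that builds a synthetic signal containing the three sand frequencies quoted above (25, 42, and 72 kHz) and recovers them as spectral peaks. It is not the researchers' pipeline: the sample rate, amplitudes, threshold, and function names are all assumptions chosen for illustration, and the small radix-2 FFT is written out only so the example runs with the standard library.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def resonant_peaks(signal, sample_rate_hz, rel_threshold=0.25):
    """Return frequencies (Hz) of local spectral maxima above a relative threshold."""
    n = len(signal)
    mags = [abs(c) for c in fft(signal)[: n // 2]]  # positive frequencies only
    floor = rel_threshold * max(mags)
    peaks = []
    for k in range(1, n // 2 - 1):
        if mags[k] > floor and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]:
            peaks.append(k * sample_rate_hz / n)
    return peaks


# Assumed acquisition settings (hypothetical): 1.024 MHz sampling, 1024 samples,
# which gives a frequency resolution of exactly 1 kHz per bin.
FS = 1_024_000
N = 1024
tones = [(25_000, 1.0), (42_000, 0.8), (72_000, 0.6)]  # (frequency Hz, amplitude)
signal = [
    sum(a * math.sin(2 * math.pi * f * i / FS) for f, a in tones)
    for i in range(N)
]

peaks = resonant_peaks(signal, FS)
print([f"{p / 1000:.0f} kHz" for p in peaks])
```

A real measurement would be noisier and would usually need a window function and a more careful peak detector; the sketch only shows how a time-domain trace becomes a list of resonant frequencies.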
The AI Agent: Not Just a Model, A “Colleague”
This study marks a significant departure from conventional machine learning research methodologies. Rather than simply developing a predictive classifier, the research team has engineered what they refer to as an Agentic AI.
But what exactly sets this apart from traditional models? Let’s break it down.
A standard machine learning model typically functions by making predictions based on the data it receives; you input a dataset, and it processes this information to yield a numerical output.
In contrast, an AI Agent takes this a step further. This type of AI is characterized by its ability to take autonomous actions. It not only collects and analyzes data on its own but also communicates with other systems to optimize its performance. Moreover, it has the capability to interpret its outputs independently and make decisions without the need for human oversight.
To implement this innovative approach, the researchers developed a framework known as the SHP-Agent (Soil Heaving Predictor Agent). This framework is specifically designed to enhance the capabilities of AI in predicting soil heaving, showcasing the practical applications of an Agentic AI in complex scenarios. With SHP-Agent, the boundaries of machine learning are pushed further, emphasizing the importance of autonomy and decision-making in artificial intelligence.
The Architecture: A Hybrid Mind
The AI agent doesn’t rely on a single algorithm. It utilizes a “hybrid” architecture combining three distinct models, each chosen for a specific reason:
1. Convolutional Neural Network (CNN)
   - Job: The “Eyes.”
   - Input: Spectral images (waveforms converted into visual maps).
   - Why? CNNs are exceptional at recognizing spatial patterns. Just as a CNN can look at a photo of a cat and identify whiskers, this CNN looks at a soil spectrum and identifies resonant frequencies associated with clay content.
   - How: The CNN uses 3D convolution to process spectral data, temperature, and moisture simultaneously. It reduces noise through max-pooling (essentially zooming out to see the big picture).
2. Support Vector Machine (SVM)
   - Job: The “Classifier.”
   - Input: Features extracted by the CNN, plus physical parameters (density, clay content, elastic moduli).
   - Why? SVMs are excellent at drawing boundaries. In high-dimensional space, the SVM draws a “line” between “Non-Heaving” and “Extra High-Heaving” soils.
   - Output: H(Class) ∈ {0,1,2,3,4} (0 = no heave, 4 = catastrophic heave).
3. Random Forest (RF)
   - Job: The “Regressor.”
   - Input: Same as SVM.
   - Why? Heaving isn’t just a category; it’s a measurement (millimeters of lift). Random Forest aggregates hundreds of decision trees to predict the exact deformation value, \( H_{def} \).
   - Trick: The researchers also used the Random Forest’s bootstrapping capability to generate 700 synthetic samples to supplement their 300 real experiments. This isn’t “making up data”; it’s statistically interpolating between known points to cover rare scenarios (e.g., extreme cold + high moisture + high clay).
4. The Fusion:
The final prediction is determined through weighted voting across the models. For classification, the SVM and the Random Forest (RF) each contribute 50%. For the deformation regression, the weights shift to 30% from the SVM and 70% from the RF. The weighting is meant to ensure that if one model is uncertain, the other can compensate, improving the accuracy and robustness of the result.
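The weighted voting described above can be sketched as follows. Only the weights (50/50 for classification, 30/70 for deformation) come from the article. The function names, the probability vectors, and the millimeter estimates are hypothetical stand-ins for whatever the trained SVM and Random Forest would actually output.

```python
from typing import Sequence, Tuple

CLASS_WEIGHTS: Tuple[float, float] = (0.5, 0.5)    # (SVM, RF) for heave class
DEFORM_WEIGHTS: Tuple[float, float] = (0.3, 0.7)   # (SVM, RF) for deformation


def fuse_class(
    svm_probs: Sequence[float],
    rf_probs: Sequence[float],
    weights: Tuple[float, float] = CLASS_WEIGHTS,
) -> int:
    """Blend per-class probabilities and return the most likely heave class."""
    if len(svm_probs) != len(rf_probs):
        raise ValueError("both models must score the same set of classes")
    w_svm, w_rf = weights
    fused = [w_svm * s + w_rf * r for s, r in zip(svm_probs, rf_probs)]
    return max(range(len(fused)), key=fused.__getitem__)


def fuse_deformation(
    svm_mm: float,
    rf_mm: float,
    weights: Tuple[float, float] = DEFORM_WEIGHTS,
) -> float:
    """Weighted average of the two deformation estimates, in millimeters."""
    w_svm, w_rf = weights
    return w_svm * svm_mm + w_rf * rf_mm


# Hypothetical model outputs for one soil sample (classes 0..4):
svm_probs = [0.05, 0.10, 0.40, 0.35, 0.10]   # SVM leans toward class 2
rf_probs = [0.02, 0.08, 0.20, 0.50, 0.20]    # RF leans toward class 3

heave_class = fuse_class(svm_probs, rf_probs)
deformation_mm = fuse_deformation(svm_mm=12.0, rf_mm=16.0)

print(f"fused class: {heave_class}, fused deformation: {deformation_mm:.1f} mm")
```

In this example the two models disagree on the class, and the blended probabilities settle on class 3 because the Random Forest is more confident than the SVM.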
The “A2A” Protocol: Talking to Other Robots
One of the most advanced and forward-thinking elements of this study is the implementation of the Agent-to-Agent (A2A) communication protocol. This protocol enables seamless interaction between various agents in a way that enhances their operational efficiency and collaborative capabilities.
Currently, the SHP-Agent operates locally on a user’s personal computer. However, the envisioned future infrastructure will not depend on isolated systems; rather, it will feature interconnected agents that share information and respond dynamically to environmental changes. Picture the following scenario:
The ENV-Agent (Environmental Monitor) is actively observing climatic conditions and identifies an impending cold front, which is accompanied by a significant drop in soil temperature. Its sensors detect the temperature beginning to decrease.
In response to this critical data, the ENV-Agent sends a message to the SHP-Agent: “Attention: Temperature is projected to drop to -3°C. Please proceed with checking soil moisture levels to assess potential impacts.”
Upon receiving this alert, the SHP-Agent swiftly activates its acoustic sensors to gather additional data. It then performs a comprehensive spectral scan to analyze changes in soil conditions. After processing this information, it calculates a heaving risk, categorizing it as “High.”
With this assessment in hand, the SHP-Agent promptly prepares a JSON packet, a structured data format that facilitates communication with other agents, and sends it to the INFRA-Agent (Infrastructure Response Agent). The message reads: “Risk rating 4. Recommend implementing speed restrictions on Section B-12 of the roadway to ensure safety.”
This scenario is not merely a theoretical exercise; it represents the capabilities of functioning software designed to enhance operational responses in real-time. The integration of structured messaging through the A2A protocol allows these agents to be platform-agnostic, meaning they can operate effectively across various operating systems, including Windows, Linux, or even within cloud-based container environments. This level of interoperability is crucial for the future of automated systems, where efficiency and adaptability are paramount.
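For a sense of what such a structured message might look like, here is a minimal sketch of the SHP-Agent's packet to the INFRA-Agent. The article does not give the actual A2A message schema, so every field name here (`sender`, `recipient`, `section`, `risk_class`, `recommendation`) is a hypothetical placeholder. Only the agent names, the section label, and the 0 to 4 risk scale come from the text.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RiskMessage:
    """Hypothetical agent-to-agent payload; not the schema used in the study."""
    sender: str
    recipient: str
    section: str
    risk_class: int        # 0 = no heave ... 4 = catastrophic heave
    recommendation: str

    def __post_init__(self) -> None:
        # Reject values outside the five-level scale described in the article.
        if not 0 <= self.risk_class <= 4:
            raise ValueError("risk_class must be between 0 and 4")

    def to_json(self) -> str:
        """Serialize to a JSON string that any platform can parse."""
        return json.dumps(asdict(self), sort_keys=True)


message = RiskMessage(
    sender="SHP-Agent",
    recipient="INFRA-Agent",
    section="B-12",
    risk_class=4,
    recommendation="speed_restriction",
)

packet = message.to_json()      # what would travel between agents
decoded = json.loads(packet)    # what the receiving agent would read

print(packet)
```

Because the payload is plain JSON, the receiving agent needs no knowledge of the sender's operating system or programming language, which is the platform-agnostic property described above.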
The Results: How Accurate Is It?
Evaluated against traditional laboratory methods and on-site monitoring data, the AI agent reached 92% accuracy in classification.
Its F1-score for “failure” detection was 0.91, meaning it identifies failures consistently, with few misses or false alarms. On the regression side, the AI achieved an R² of 0.91, close to the 0.93 attained through conventional lab testing.
From an economic perspective, the AI agent presents significant cost advantages over traditional methods. While a traditional lab campaign can range from approximately $12,000 to $16,000 and takes about 34 to 38 days to complete, on-site monitoring is even more costly, averaging between $14,000 and $20,000 with a prolonged timeline of 1 to 2 years.
In sharp contrast, the setup cost for the AI agent, which includes an oscilloscope, sensors, and a PC, totals around $4,400, with ongoing operational costs estimated at around $1,500. This stark difference in expenditure shows the potential for substantial savings. Furthermore, the AI system simplifies the process by eliminating the need for a soil scientist to interpret complex Atterberg limits or for a data engineer to clean and prepare CSV files. Instead, it only necessitates the expertise of a “prompt engineer” to oversee the system’s coordination, making it an efficient and accessible solution for various applications.
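The savings can be worked out directly from the figures quoted above. One assumption is needed: the article does not say what period the $1,500 operating cost covers, so this sketch treats it as the cost of a single survey campaign. The variable names are illustrative only.

```python
# Cost figures quoted in the article, in US dollars.
LAB_CAMPAIGN = (12_000, 16_000)       # traditional lab campaign, low/high
SITE_MONITORING = (14_000, 20_000)    # on-site monitoring, low/high
AGENT_SETUP = 4_400                   # oscilloscope, sensors, PC
AGENT_OPERATION = 1_500               # ASSUMPTION: per survey campaign

# First campaign with the agent: one-off hardware plus operating cost.
agent_first_campaign = AGENT_SETUP + AGENT_OPERATION

savings_vs_lab = tuple(cost - agent_first_campaign for cost in LAB_CAMPAIGN)
savings_vs_monitoring = tuple(cost - agent_first_campaign for cost in SITE_MONITORING)

print(f"agent, first campaign: ${agent_first_campaign:,}")
print(f"saved vs lab: ${savings_vs_lab[0]:,} to ${savings_vs_lab[1]:,}")
print(f"saved vs monitoring: ${savings_vs_monitoring[0]:,} to ${savings_vs_monitoring[1]:,}")
```

Under that assumption, later campaigns would cost only the operating amount, since the hardware is already paid for.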
The Acoustic Signatures: Why Clay “Sounds” Different
The study yielded a striking visualization: non-linear regression surfaces. Plotting frequency against density and acoustic compressibility (β) produces a 3D terrain map for each soil type.
- Sand: It’s as flat as a pancake. Variations in density have little impact on sound. The large, rigid sand particles allow sound waves to glide through the air gaps with minimal resistance.
- Sandy Loam: It features a gentle slope. The addition of silt-sized particles introduces some friction, altering the sound dynamics.
- Loam: It forms a curved surface. The combination of sand, silt, and clay forms a complex matrix that scatters, absorbs, and reflects sound waves.
- Clay: Here, we encounter a steep cliff. Clay particles, which are plate-like and electrically charged, bond with water molecules, resulting in a viscous, plastic medium. While sound travels faster through these water films, the energy loss, or attenuation, is significant.
Why This Matters for Your Morning Commute
It’s easy to dismiss soil science as a niche discipline. But infrastructure is invisible until it fails.
In 2023, extreme temperatures caused train tracks to buckle in the UK, stranding thousands of passengers. In the US Midwest, frost heave creates “railroad bumps” that force freight trains to slow down, increasing shipping costs for everything from grain to automobiles.
By deploying AI agents like SHP-Agent, railway operators can:
- Shift from Reactive to Predictive: Stop fixing heaving damage; prevent it by identifying high-risk zones before winter.
- Reduce Carbon Footprint: Less heavy machinery drilling boreholes, fewer trucks hauling samples.
- Increase Safety: Real-time risk scores can be fed directly into train control systems, automatically reducing speeds when the ground becomes unstable.
Conclusion
We are entering an era where the ground beneath our feet is becoming “smart.” Not because the soil itself has changed, but because our tools have evolved.
This research suggests that with a relatively modest investment in hardware and a sophisticated AI agent, we can extract decades of geotechnical experience and encode it into a machine that works 24/7, never gets tired, and never misses a resonance peak.
The next time you hear the rhythmic clatter of a train on the tracks, remember: beneath the steel and stone, there might be an AI agent listening. It’s checking the pulse of the earth, ensuring the path ahead is stable, and quietly preventing a disaster you’ll never know almost happened.

Reference
Zaitsev, A., Koshurnikov, A., Gagarin, V. et al. AI-driven spectral analysis of soil heaving for automated surveys in rail transport infrastructure. AI Civ. Eng. 4, 23 (2025). https://doi.org/10.1007/s43503-025-00072-8

