Aluvial.
Methodology

How We Know the Models Work

Most agricultural AI systems are trained as statistical black boxes: large historical datasets are used to learn correlations between weather, soils, and yield outcomes. Aluvial takes a different approach. Our models are rooted in mechanistic plant science — we simulate the underlying biological and physical processes governing crop growth rather than relying solely on statistical pattern matching.

At the foundation of the platform is a biophysical crop model library calibrated against controlled-environment and field observations across corn, soybean, wheat, canola, camelina, pennycress, and broader oilseed systems [1: Jones, J. W. et al. The DSSAT cropping system model. Eur. J. Agron. 18, 235–265 (2003). doi.org/10.1016/S1161-0301(02)00107-7] [2: Keating, B. A. et al. An overview of APSIM, a model designed for farming systems simulation. Eur. J. Agron. 18, 267–288 (2003). doi.org/10.1016/S1161-0301(02)00108-9]. Weather forcing is derived from ERA5-Land [3: Muñoz-Sabater, J. et al. ERA5-Land: A state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data 13, 4349–4383 (2021). doi.org/10.5194/essd-13-4349-2021], PRISM, Daymet, and WorldClim, while soil constraints are parameterized using SSURGO, gSSURGO, and SoilGrids [4: Hengl, T. et al. SoilGrids250m: Global gridded soil information based on machine learning. PLoS ONE 12, e0169748 (2017). doi.org/10.1371/journal.pone.0169748]. This physiological foundation enables Aluvial to model Genotype × Environment × Management (G×E×M) interactions [5: van Ittersum, M. K. & Rabbinge, R. Concepts in production ecology for analysis and quantification of agricultural input-output combinations. Field Crops Res. 52, 197–208 (1997). doi.org/10.1016/S0378-4290(97)00037-3] directly and translate them into deployment intelligence, risk surfaces, and carbon intensity projections.

The underlying failure mode in agricultural commercialization is rarely product efficacy in isolation — it is the Commercial Gap: the disconnect between controlled R&D success and environmental variability under real-world deployment. A biological showing an average 8% yield response may still fail commercially if 30% of target environments fall below the economic response threshold due to soil water limitations, nitrogen availability, or adverse weather realization. Aluvial is designed to quantify that uncertainty before large-scale deployment decisions are made.
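The arithmetic behind that example can be sketched in a few lines of Python. The ten response values and the 5% economic threshold below are hypothetical, chosen only so that the mean is 8% and three of ten environments fall short; they are not Aluvial data, and the function name is illustrative.

```python
from statistics import mean

def commercial_gap(responses_pct, threshold_pct):
    """Return (mean response, share of environments below the threshold)."""
    below = [r for r in responses_pct if r < threshold_pct]
    return mean(responses_pct), len(below) / len(responses_pct)

# Hypothetical yield responses (%) in ten target environments.
responses = [14.0, 12.0, 11.0, 10.0, 9.0, 9.0, 8.0, 3.0, 2.0, 2.0]

# Hypothetical economic response threshold of 5%.
avg, share_below = commercial_gap(responses, threshold_pct=5.0)
print(f"mean response: {avg:.1f}%  share below threshold: {share_below:.0%}")
```

A product can look attractive on the average while a sizeable share of its target environments never clears the threshold, which is the gap the mean alone hides.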

30–50 observations achieve stable prediction surfaces, compared with the 200–500 often required for unconstrained machine learning, because physiological priors constrain the hypothesis space before any empirical data is introduced.

Background

A Brief History of Biophysical Crop Modeling

Biophysical crop simulation emerged from the Dutch school of production ecology in the late 1950s, with C. T. de Wit's foundational work on radiation interception, transpiration, and dry matter assimilation establishing the mechanistic basis for whole-plant growth modeling. Ritchie's soil water balance and evapotranspiration formulations [6: Ritchie, J. T. Model for predicting evaporation from a row crop with incomplete cover. Water Resour. Res. 8, 1204–1213 (1972). doi.org/10.1029/WR008i005p01204] provided the hydrological backbone that subsequent systems built upon.

By the 1980s, the CERES model family codified species-specific phenology and nitrogen cycling within a daily time step. These were integrated into the Decision Support System for Agrotechnology Transfer (DSSAT) [1], enabling multi-season simulation of cropping sequences under variable management and weather. The Agricultural Production Systems sIMulator (APSIM) [2] followed with a modular architecture suited to diverse farming systems, while STICS [7: Brisson, N. et al. An overview of the crop model STICS. Eur. J. Agron. 18, 309–332 (2003). doi.org/10.1016/S1161-0301(02)00110-7] extended the framework to explicit organ-level carbon–nitrogen partitioning.

These systems now form the core infrastructure for global yield gap analyses, climate adaptation studies, and deployment risk quantification — increasingly coupled with remote sensing assimilation, machine learning emulators, and multi-model ensemble uncertainty frameworks.

Model Architecture

Genotype-Aware Parameters in Biophysical Models

The G×E×M framework, formalized by van Ittersum and Rabbinge [5], recognizes that observed yield is the product of genetic potential, environmental expression, and management decisions. In biophysical models, genotype is encoded through cultivar coefficients: discrete parameters specifying photoperiod sensitivity, thermal time requirements for key phenological stages, maximum leaf area expansion, radiation-use efficiency, and grain-filling duration [1].

Hammer et al. [8: Hammer, G. L. et al. Models for navigating biological complexity in breeding improved crop plants. Trends Plant Sci. 11, 587–593 (2006). doi.org/10.1016/j.tplants.2006.10.006] demonstrated that cultivar-parameterized models can navigate biological complexity across breeding populations and predict performance in unobserved environments, a capacity that purely statistical approaches cannot replicate without far larger empirical datasets. This extrapolation potential underlies Aluvial's spatial deployment intelligence: coefficients estimated from controlled calibration trials are propagated through high-resolution weather and soil surfaces to generate deployment risk profiles across thousands of geographies.

Advances in genomic prediction have further linked quantitative trait loci (QTL) to model parameters, enabling hybrid approaches that use marker data to inform priors on biophysical coefficients [9: Cooper, M. et al. Predicting the future of plant breeding: complementing empirical evaluation with genetic prediction. Crop Pasture Sci. 65, 311–336 (2014). doi.org/10.1071/CP14007] [10: van Eeuwijk, F. A. et al. Modelling strategies for assessing and increasing the effectiveness of new phenotyping techniques in plant breeding. Plant Sci. 282, 23–39 (2019). doi.org/10.1016/j.plantsci.2018.06.018]. Genotype-aware models now provide the analytical substrate for connecting breeding pipeline outputs directly to commercial deployment risk.

Stage 1

Aluvial FRAME & Crop Model Library

Evidence Framing

Before running a single simulation, Aluvial FRAME defines the exact commercial decision and the evidence required to support it. Rather than conducting open-ended research, FRAME synthesizes existing trial, field, and environmental data to establish what is known, and what is not, about a product's performance under real deployment conditions. Each product is mapped into a structured Genotype × Environment × Management (G×E×M) coordinate system [5] [8], identifying criteria for ideal responder classes and flagging critical knowledge gaps. This targeted knowledge base directly informs Aluvial's AI pipelines and baseline modeling.

Biophysical Prior

Once framed, this knowledge base feeds Aluvial's BMST, high-performance biophysical crop modeling software that we author and maintain, configured for major commodity grains, oilseeds, and cover crops. This mechanistic representation serves as a biophysical prior [17: Willard, J. et al. Integrating scientific knowledge with machine learning for engineering and environmental systems. ACM Comput. Surv. 55, 1–37 (2022). doi.org/10.1145/3514228] that enforces known physiological limits before any statistical learning begins, ensuring that deep learning layers are calibrated against residual behavior rather than inferring basic biology from scratch [16: Feng, P. et al. Dynamic wheat yield forecasts are improved by a hybrid approach using a biophysical model and machine learning technique. Agric. For. Meteorol. 285–286, 107922 (2020). doi.org/10.1016/j.agrformet.2020.107922].

2–3× data compression: by constraining the neural networks with a biophysical prior, Aluvial reduces required field observations from over 200 down to 50–80, substantially improving extrapolation across novel weather patterns, management regimes, and unexplored geographies [17].

Data Isolation

Client engagement data is never merged into shared calibration datasets or used to train global models for other accounts. Genotype identity tracking is maintained for all client-supplied materials, and each engagement remains logically isolated. Proprietary observations are returned exclusively within client-specific outputs.

Biophysical Models

The Mechanistic Engine: Physics-Informed Neural Networks

Aluvial's biophysical model library is configured for major row crops and bioenergy feedstocks: corn, soybeans, spring wheat, canola, and camelina. Daily biomass accumulation is simulated using a radiation use efficiency (RUE) framework [11: Monteith, J. L. Climate and efficiency of crop production in Britain. Philos. Trans. R. Soc. Lond. B 281, 277–294 (1977). doi.org/10.1098/rstb.1977.0140] [12: Sinclair, T. R. & Muchow, R. C. Radiation use efficiency. Adv. Agron. 65, 215–265 (1999). doi.org/10.1016/S0065-2113(08)60914-1], in which intercepted photosynthetically active radiation is converted to dry matter at a species-specific efficiency coefficient, while phenological progression is driven by growing degree day (GDD) thermal accumulation [13: McMaster, G. S. & Wilhelm, W. W. Growing degree-days: one equation, two interpretations. Agric. For. Meteorol. 87, 291–300 (1997). doi.org/10.1016/S0168-1923(97)00027-0].
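A minimal Python sketch of these two building blocks follows. It is illustrative only and not the BMST implementation: the function names are hypothetical, the 10 °C and 30 °C thresholds and the RUE value are placeholder numbers, and light interception is computed with a Beer-Lambert term (1 − exp(−k · LAI)), which is an assumption added here. The GDD function compares the daily mean temperature with the base temperature, one of the two interpretations discussed by McMaster and Wilhelm, with an upper cap added as a further assumption.

```python
import math

def growing_degree_days(t_min, t_max, t_base=10.0, t_upper=30.0):
    """Daily GDD from the mean temperature, clamped to [t_base, t_upper].

    Thresholds are illustrative defaults, not calibrated cultivar values.
    """
    t_mean = (t_min + t_max) / 2.0
    t_mean = max(t_base, min(t_mean, t_upper))
    return t_mean - t_base

def potential_biomass(par_mj_m2, lai, rue_g_per_mj, k_ext=0.5):
    """Potential dry matter (g/m2/day) = RUE x intercepted PAR.

    Interception uses Beer-Lambert, 1 - exp(-k * LAI); this is an
    assumption of the sketch, as is the extinction coefficient k_ext.
    """
    intercepted = par_mj_m2 * (1.0 - math.exp(-k_ext * lai))
    return rue_g_per_mj * intercepted

# One hypothetical day: 12-28 degC, 10 MJ/m2 of PAR, leaf area index of 3.
gdd_today = growing_degree_days(t_min=12.0, t_max=28.0)
growth_today = potential_biomass(par_mj_m2=10.0, lai=3.0, rue_g_per_mj=3.0)
print(f"GDD: {gdd_today:.1f} degC-day, potential growth: {growth_today:.1f} g/m2")
```

Summing the daily GDD values gives the thermal time that drives phenological stage transitions, while the daily biomass increments accumulate into total dry matter.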

Biomass production is constrained at each daily timestep by independent, multiplicative physiological stress scalars governing:

  • water availability
  • temperature stress
  • nitrogen limitation
  • phosphorus limitation
  • potassium limitation

Each scalar is bounded between zero and one. This multiplicative formulation ensures that a single severe constraint, such as a flash drought, can realistically suppress the crop's response to otherwise favorable conditions, mirroring yield losses observed under combined water and nutrient stress [14: Steduto, P., Hsiao, T. C., Fereres, E. & Raes, D. Crop Yield Response to Water. FAO Irrigation and Drainage Paper No. 66. FAO, Rome (2012). fao.org/3/i2800e/i2800e.pdf]. Temperature functions also govern both the instantaneous photosynthetic rate and cumulative development, enforcing threshold-based reductions during extreme heat or cold [15: Asseng, S. et al. Rising temperatures reduce global wheat production. Nat. Clim. Change 5, 143–147 (2015). doi.org/10.1038/nclimate2470].
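The multiplicative form can be illustrated with a short Python sketch. The factor names mirror the list above; the function names and the numeric values are hypothetical, and any factor not supplied is treated as unstressed (1.0), which is an assumption of this example.

```python
STRESS_FACTORS = ("water", "temperature", "nitrogen", "phosphorus", "potassium")

def clamp01(value):
    """Keep a stress scalar inside the [0, 1] interval."""
    return max(0.0, min(1.0, value))

def stress_limited_growth(potential, scalars):
    """Multiply potential growth by every stress scalar (missing = no stress)."""
    factor = 1.0
    for name in STRESS_FACTORS:
        factor *= clamp01(scalars.get(name, 1.0))
    return potential * factor

# Same nitrogen status, with and without a severe water deficit.
favorable = stress_limited_growth(20.0, {"water": 1.0, "nitrogen": 0.9})
flash_drought = stress_limited_growth(20.0, {"water": 0.2, "nitrogen": 0.9})
print(f"favorable: {favorable:.1f} g/m2, flash drought: {flash_drought:.1f} g/m2")
```

Because the scalars multiply, the low water scalar caps growth regardless of how favorable the other factors are.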

PINN Architecture

Bridging Biophysics and Deep Learning

The physical equations constrain the hypothesis space to biologically possible outcomes before any field data is introduced, while neural network layers are calibrated against the residual behavior: the complex, non-linear G×E×M interactions [5] [8] that the mechanistic system cannot fully explain on its own. This is a Physics-Informed Neural Network (PINN) framework [17]: rigid biophysics provides structure, deep learning provides empirical adaptation.

Unconstrained machine learning must simultaneously infer biological structure and environmental response from noisy field observations, which requires datasets that are often unattainable at early product stages. The physics-informed prior relaxes this requirement, reducing the data volume needed for a given precision by roughly 2–3× [17] and improving extrapolation to novel geographies and management regimes where purely statistical systems degrade [16].
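The division of labor between the mechanistic prior and the learned residual can be sketched as follows. This is a deliberately small stand-in, not Aluvial's architecture: a one-variable least-squares line takes the place of the neural network layers, `mechanistic_yield` is a hypothetical placeholder for the biophysical model, and all numbers are invented for illustration.

```python
def mechanistic_yield(water_index):
    """Hypothetical stand-in for the biophysical prior (t/ha)."""
    return 10.0 * water_index

def fit_residual_line(covariate, residuals):
    """Ordinary least squares for residual = intercept + slope * covariate."""
    n = len(covariate)
    mean_x = sum(covariate) / n
    mean_r = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(covariate, residuals))
    slope = sxr / sxx
    return mean_r - slope * mean_x, slope

# Invented observations: water index, a soil covariate, observed yield (t/ha).
water = [0.4, 0.6, 0.8, 1.0]
soil_covariate = [1.0, 2.0, 3.0, 4.0]
observed = [4.7, 6.9, 9.1, 11.3]

# The learner only sees what the mechanistic model fails to explain.
residuals = [y - mechanistic_yield(w) for y, w in zip(observed, water)]
intercept, slope = fit_residual_line(soil_covariate, residuals)

def hybrid_predict(water_index, covariate_value):
    """Mechanistic prediction plus the learned residual correction."""
    return mechanistic_yield(water_index) + intercept + slope * covariate_value

print(f"hybrid prediction: {hybrid_predict(0.5, 2.5):.2f} t/ha")
```

Since the statistical component only has to model the residual, it has far less structure to learn than a model fitted to raw yields, which is the intuition behind the reduced data requirement.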

Stage 2

Aluvial EXECUTE: Micro Infrastructure for High-Resolution Calibration

Traditional field-trial programs often require multiple growing seasons and millions in capital to separate a product's true treatment effect from random environmental noise [21: Piepho, H.-P., Möhring, J., Melchinger, A. E. & Büchse, A. BLUP for phenotypic selection in plant breeding and variety testing. Euphytica 161, 209–228 (2008). doi.org/10.1007/s10681-007-9449-8]. Aluvial bypasses this bottleneck through its Micro infrastructure, a proprietary Biophysical Calibration Loop that feeds the modeling stack and allows complex Genotype × Environment × Management (G×E×M) interactions [5] to be isolated and quantified much earlier in the product lifecycle. It functions not as a traditional testing service, but as a structured calibration input.

High-Resolution Data Capture

Rather than conducting open-ended research, Aluvial executes bounded Genotype-Conditional Validation Sprints. Greenhouse and growth chamber programs generate 3–4 rapid, replicated calibration cycles annually, capturing the sub-daily environmental covariates that traditional field observations often miss [22: Poorter, H. et al. Pot size matters: a meta-analysis of the effects of rooting volume on plant growth. Funct. Plant Biol. 39, 839–850 (2012). doi.org/10.1071/FP12049]:

  • temperature profiles
  • vapor pressure deficit (VPD)
  • photosynthetically active radiation (PAR)
  • soil moisture dynamics

Disentangling Signal from Noise

Trials are executed using strict factorial designs against immutable genotype panels spanning core reference lines, client germplasm, and exploratory material. Controlled nitrogen gradients and environmental stress manipulations (simulated flash droughts, heat spikes) establish precise biophysical response baselines [23: Skirycz, A. & Inzé, D. More from less: plant growth under limited water. Curr. Opin. Biotechnol. 21, 197–203 (2010). doi.org/10.1016/j.copbio.2010.03.002]. This isolates the biological treatment effect, disentangling a product's true performance from random weather realizations and environmental covariance [21].
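A full factorial layout of this kind can be enumerated with a few lines of Python. The factor names, levels, and replicate count below are hypothetical examples rather than an actual Aluvial trial plan; the shuffle stands in for randomized placement of units within the growth environment.

```python
import random
from itertools import product

def factorial_design(factors, replicates, seed=0):
    """Full factorial: every combination of factor levels, replicated and shuffled."""
    names = list(factors)
    runs = []
    for combo in product(*(factors[name] for name in names)):
        for rep in range(1, replicates + 1):
            run = dict(zip(names, combo))
            run["replicate"] = rep
            runs.append(run)
    random.Random(seed).shuffle(runs)  # randomize placement order reproducibly
    return runs

# Hypothetical factors: 2 genotypes x 3 nitrogen levels x 3 stress regimes.
factors = {
    "genotype": ["reference_line", "client_line"],
    "nitrogen": ["low", "medium", "high"],
    "stress": ["none", "flash_drought", "heat_spike"],
}
design = factorial_design(factors, replicates=3)
print(f"{len(design)} experimental units")
```

Crossing every level of every factor is what allows main effects and interactions to be estimated separately instead of being confounded with one another.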

Accelerating the Go / No-Go Decision

For biological input manufacturers and seed developers, this shortens the path from exploratory screening to validated deployment intelligence. Instead of waiting 2–3 seasons for noisy field-trial data to accumulate, calibrated response surfaces begin forming within a single growing season. The output is decision-grade intelligence: either a high-value “Fast No” that halts wasted field spend on non-transferable signals, or explicit responder classifications that de-risk product launches before broad field budgets are committed.

Stage 3

Aluvial EXECUTE: Physics-Informed, Macro Scale Analysis

To translate calibrated crop behavior into actionable spatial deployment intelligence, Aluvial's Macro engine is powered by a Physics-Informed Hierarchical Modeling (PIHM) framework [17] that systematically bridges the gap between controlled R&D data and heterogeneous field variability. At its base sits the mechanistic biophysical prior from BMST; above it, a Bayesian hierarchy pools observations across environments to estimate genotype- and management-conditional residual behavior [19: Malosetti, M., Ribaut, J.-M. & van Eeuwijk, F. A. The statistical analysis of multi-environment data: modelling genotype-by-environment interaction and its genetic basis. Front. Physiol. 4, 44 (2013). doi.org/10.3389/fphys.2013.00044].

Three-Level Bayesian Hierarchy

The hierarchy scales evidence from controlled environments to commercial geographies through three progressively richer levels:

Level 1 — Foundation Weights

Establishes the baseline environmental fingerprint using large-scale climate and soil context, encoding the prior expectation of crop behavior before any client data is introduced [3].

Level 2 — Micro-Calibrated Posteriors

Integrates controlled greenhouse data from Aluvial EXECUTE to establish genotype-conditional response functions and treatment effects under defined stress conditions, updating the posterior with high-signal, low-noise observations [19].

Level 3 — Macro Field Observations

Pools field-level observations and in-season weather realizations to update the prior structure, capturing how treatments perform across complex, real-world Genotype × Environment × Management (G×E×M) interactions [5] at commercial scale.
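The flow of evidence through the three levels can be illustrated with a sequence of conjugate normal updates. This is a simplification made for the sake of a short example: a full hierarchical model would estimate variance components jointly, whereas here each observation variance is assumed known. All numbers (a prior on yield response in percent, three greenhouse observations, four field observations) are hypothetical.

```python
def normal_update(prior_mean, prior_var, observations, obs_var):
    """Conjugate normal update with known observation variance.

    Returns the posterior (mean, variance) for the underlying effect.
    """
    precision = 1.0 / prior_var + len(observations) / obs_var
    mean = (prior_mean / prior_var + sum(observations) / obs_var) / precision
    return mean, 1.0 / precision

# Level 1: broad prior on yield response (%) from climate and soil context.
level1 = (5.0, 16.0)

# Level 2: low-noise controlled-environment observations.
greenhouse = [8.0, 7.0, 9.0]
level2 = normal_update(*level1, greenhouse, obs_var=4.0)

# Level 3: noisier field observations refine the Level 2 estimate.
field = [6.0, 10.0, 4.0, 9.0]
level3 = normal_update(*level2, field, obs_var=25.0)

for label, (mean, var) in [("L1", level1), ("L2", level2), ("L3", level3)]:
    print(f"{label}: mean={mean:.2f}%  sd={var ** 0.5:.2f}")
```

In this sketch the low-noise greenhouse observations move the estimate most, and the noisier field observations refine it further while the variance keeps shrinking.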

High-Resolution Spatial Intelligence

Spatial outputs are generated at multiple resolutions tailored to commercial deployment scope: approximately 4 km continental US coverage using PRISM climate surfaces [18: Daly, C. et al. Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol. 28, 2031–2064 (2008). doi.org/10.1002/joc.1688], field-scale resolution using gSSURGO soil constraints, and approximately 9 km global analyses using ERA5-Land [3]. This allows the Macro engine to map responder terrains and identify geographies where yield, oil content, and profitability thresholds clear together.

Probabilistic Risk Outputs (P10 / P50 / P90)

Deterministic point estimates are insufficient for commercial risk decisions. All Macro spatial outputs are delivered as probabilistic surfaces [20: Asseng, S. et al. Uncertainty in simulating wheat yields under climate change. Nat. Clim. Change 3, 827–832 (2013). doi.org/10.1038/nclimate1916] quantifying yield response, carbon intensity, and downside risk as explicit P10/P50/P90 percentiles of the simulated distribution. This equips procurement and commercialization teams with a quantified view of worst-case exposure before capital or field budgets are committed.
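As a small illustration of how such summaries are derived, the Python sketch below computes P10/P50/P90 from a set of simulated outcomes. The simulated values are random draws invented for the example, and the convention assumed here is that P10 is the 10th percentile (the low end); some industries use the opposite, exceedance-based convention.

```python
import random
from statistics import quantiles

def p10_p50_p90(samples):
    """Return the 10th, 50th and 90th percentiles of the samples."""
    deciles = quantiles(samples, n=10)  # nine cut points: P10 ... P90
    return deciles[0], deciles[4], deciles[8]

# Hypothetical ensemble of simulated yield responses (%) for one grid cell.
rng = random.Random(42)
simulated = [rng.gauss(8.0, 4.0) for _ in range(5000)]

p10, p50, p90 = p10_p50_p90(simulated)
print(f"P10={p10:.1f}%  P50={p50:.1f}%  P90={p90:.1f}%")
```

Repeating this calculation for every grid cell gives three map layers, one per percentile, instead of a single point-estimate surface.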

The Compounding Data Advantage

Because the hierarchical Bayesian models pool cross-trial data, each additional calibration cycle tightens uncertainty bounds for the target geography, sharpening parameter estimates across the entire crop system [19]. A third deployment cycle within a region produces materially narrower confidence intervals than an initial engagement over the same landscape: measurable uncertainty reduction driven by accumulated evidence, not cosmetic smoothing.

Stage 4

Carbon Intensity Translation

Biophysical outputs feed directly into Aluvial's carbon intensity (CI) translation engine — a directed acyclic graph (DAG) representing the full causal pathway from field management decisions to grams of CO₂-equivalent per megajoule (gCO₂e/MJ). Nitrogen inputs, yield response, oil content, and soil carbon dynamics each contribute independently attributed uncertainty to the final CI distribution.
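A toy version of such a graph is sketched below to show how uncertainty propagates from inputs to a CI distribution. Everything in it is hypothetical: the node names, the functional forms, and every coefficient (nitrogen rate, emission factor, oil fraction, energy density) are placeholders chosen to give plausible magnitudes, and soil carbon dynamics are omitted for brevity. The sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+) to evaluate nodes in dependency order.

```python
import random
from graphlib import TopologicalSorter
from statistics import quantiles

# node -> (parent names, function of (parent values, rng)); all values hypothetical
NODES = {
    "n_rate": ((), lambda v, rng: 100.0),                    # kg N/ha
    "weather": ((), lambda v, rng: rng.gauss(1.0, 0.15)),    # yield multiplier
    "oil": ((), lambda v, rng: rng.uniform(0.36, 0.42)),     # oil fraction
    "yield": (("n_rate", "weather"),
              lambda v, rng: max(0.1, 0.02 * v["n_rate"] * v["weather"])),  # t/ha
    "energy": (("yield", "oil"),
               lambda v, rng: v["yield"] * 1000.0 * v["oil"] * 37.0),       # MJ/ha
    "emissions": (("n_rate",),
                  lambda v, rng: v["n_rate"] * max(0.0, rng.gauss(5000.0, 1000.0))),  # gCO2e/ha
    "ci": (("emissions", "energy"),
           lambda v, rng: v["emissions"] / v["energy"]),     # gCO2e/MJ
}

# Parents are always evaluated before their children.
ORDER = list(TopologicalSorter(
    {name: set(parents) for name, (parents, _) in NODES.items()}
).static_order())

def sample_ci(rng):
    """Draw one CI value by evaluating the graph in topological order."""
    values = {}
    for name in ORDER:
        values[name] = NODES[name][1](values, rng)
    return values["ci"]

rng = random.Random(7)
ci_samples = [sample_ci(rng) for _ in range(2000)]
deciles = quantiles(ci_samples, n=10)
print(f"CI P10={deciles[0]:.1f}  P50={deciles[4]:.1f}  P90={deciles[8]:.1f} gCO2e/MJ")
```

Holding all but one stochastic node fixed and re-running the sampler is one simple way to attribute the spread in CI to an individual input.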

Rather than returning single CI values, Aluvial produces probabilistic CI distributions (P10/P50/P90) reflecting uncertainty from weather realization, yield variability, management uncertainty, and emissions-factor variance. Regulatory frameworks — GREET, CA-GREET, CORSIA, and EU RED III — are parameterized dynamically at query time, so the same simulation outputs support feedstock sourcing, sustainable aviation fuel evaluation, and international compliance workflows without model re-execution.

The system additionally computes CI-at-Risk (CIaR), a downside-risk metric quantifying worst-case CI exposure under adverse environmental realizations. Additionality is derived through counterfactual comparison against regulatory baselines, enabling direct estimation of eligibility under 45Z (IRA §13704) and EU RED III additionality provisions. For recurring MRV workflows, Aluvial integrates Sentinel-1 and Sentinel-2 anomaly detection to validate that claimed management practices occurred spatially and temporally as reported.
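The source does not give a formula for CIaR, so the sketch below assumes one plausible reading: an upper-tail percentile of the simulated CI distribution, by analogy with value-at-risk. The 95th-percentile level, the function names, the placeholder samples, and the baseline value are all assumptions made for illustration.

```python
from statistics import quantiles

def ci_at_risk(ci_samples, percentile=95):
    """Upper-tail CI: the given percentile of the simulated CI distribution."""
    return quantiles(ci_samples, n=100)[percentile - 1]

def worst_case_reduction(ci_samples, baseline_ci, percentile=95):
    """CI reduction versus a counterfactual baseline, evaluated at CIaR."""
    return baseline_ci - ci_at_risk(ci_samples, percentile)

# Placeholder CI draws (gCO2e/MJ) and a hypothetical baseline of 100 gCO2e/MJ.
samples = [float(x) for x in range(1, 100)]
ciar = ci_at_risk(samples)
margin = worst_case_reduction(samples, baseline_ci=100.0)
print(f"CIaR(95)={ciar:.1f} gCO2e/MJ, worst-case reduction={margin:.1f} gCO2e/MJ")
```

Under this reading, a pathway whose reduction stays positive even at the CIaR level would retain headroom against the baseline in adverse years, not only on average.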

Application

What This Means for Feedstock Programs

For feedstock developers and refiners, the relevant question is rarely “what is the expected CI score?” The operational question is what happens under adverse conditions. A county-level CI screen therefore evaluates not only expected CI performance, but the probability distribution surrounding that outcome.

In many procurement environments, the downside tail under drought-year conditions is more important than the mean estimate itself. This allows supply sheds to be evaluated not only for expected profitability, but for regulatory resilience under stress-year weather realizations.

Live Data

Explore the Data

Sampled records from Aluvial's biophysical models showing where yield signals are present across commodity and oilseed systems.

Live biophysical model output

Every major crop, every field, every year — all consistently modeled

The Compounding Advantage

Data That Compounds Across Programs

Every engagement contributes to tighter G×E×M response surfaces across the system. Genotype identity is treated as persistent and immutable throughout the modeling stack. Observations from a biological trial in Kansas may improve CI uncertainty estimates for the same genotype in Nebraska. Oilseed deployment programs sharpen biological responder classification. Sequential calibration cycles narrow posterior uncertainty and reduce the effective noise floor surrounding future deployment decisions.

Over time, this creates a compounding intelligence effect: every validated observation improves the precision of subsequent analyses across connected deployment pathways. Static models do not improve this way. Aluvial's framework does.

Technical Validation Engagements

Most technical validation programs begin with a Performance Intelligence engagement:

  • 1–3 crop systems
  • 5–8 target geographies
  • deployment suitability analysis
  • uncertainty characterization
  • go/no-go recommendations

Initial technical assessments are typically delivered within 4–6 weeks following data onboarding.

Aluvial | Physics Powered Crop Intelligence

Stop buying software. Start buying decisions.
