In the race for higher yields and lower carbon intensity, data is the new feedstock. While traditional HPLC provides a snapshot of the past, Keit’s IRmadillo spectrometer offers a continuous stream of real-time chemical data. But the true power of this technology lies in how that data can be used to build machine learning (ML) models that don't just monitor the process, but actively optimize it.
Here is how you can use Mid-IR spectroscopy data to build a smarter, predictive production model.
Any robust machine learning model requires high-quality training data. The IRmadillo installs directly into the process lines—whether in liquefaction, propagation, fermentation, or distillation—and simultaneously measures multiple chemical species, including Ethanol, Sugars (DP1-DP4+), Lactic Acid, Glycerol, and Nitrogen (PAN/FAN).
Unlike daily lab samples, this generates a dense, continuous dataset. This "live feed" allows you to capture the subtle dynamics of process changes that infrequent sampling misses, creating a rich historical database essential for training predictive algorithms.
Machine learning models can be trained either on the calculated concentration values that the IRmadillo is usually calibrated to produce, or directly on the raw spectra. Changes in the spectra reflect the concentrations of DP4, ethanol, lactic acid, and so on, but they also contain far more information than that. Building the model on the spectra lets it use all of the relevant information they contain, which gives a more robust model.
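To make the point about extra information in the spectra concrete, here is a minimal sketch. It assumes a simple linear mixing model in which a spectrum is a weighted sum of pure-component bands. The band shapes, positions, and the "unmodelled" species are invented for illustration. They are not IRmadillo data, and this is not Keit's calibration method. The sketch fits two known components by least squares and shows that a species outside the calibration still leaves a clear trace in the residual.

```python
import math

def gaussian_band(center, width, n_points=50):
    """Return a synthetic absorbance band sampled at n_points positions."""
    return [math.exp(-((i - center) / width) ** 2) for i in range(n_points)]

# Hypothetical pure-component spectra (illustrative shapes only).
ETHANOL = gaussian_band(center=10, width=3)
GLUCOSE = gaussian_band(center=25, width=4)
UNMODELLED = gaussian_band(center=40, width=3)  # species the calibration ignores

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_two_components(spectrum, s1, s2):
    """Least-squares fit spectrum ~ c1*s1 + c2*s2 via the 2x2 normal equations."""
    a11, a12, a22 = dot(s1, s1), dot(s1, s2), dot(s2, s2)
    b1, b2 = dot(s1, spectrum), dot(s2, spectrum)
    det = a11 * a22 - a12 * a12
    c1 = (b1 * a22 - b2 * a12) / det
    c2 = (a11 * b2 - a12 * b1) / det
    residual = [y - c1 * x1 - c2 * x2 for y, x1, x2 in zip(spectrum, s1, s2)]
    return c1, c2, residual

def mix(c_ethanol, c_glucose, c_other):
    """Build a synthetic spectrum from the three bands above."""
    return [c_ethanol * e + c_glucose * g + c_other * u
            for e, g, u in zip(ETHANOL, GLUCOSE, UNMODELLED)]

def residual_norm(spectrum):
    """Size of whatever the two-component calibration cannot explain."""
    _, _, residual = fit_two_components(spectrum, ETHANOL, GLUCOSE)
    return math.sqrt(dot(residual, residual))

clean = mix(0.8, 0.3, 0.0)
contaminated = mix(0.8, 0.3, 0.2)
print(residual_norm(clean), residual_norm(contaminated))
```

In this toy case both spectra return almost the same ethanol and glucose values, so concentration data alone would not distinguish them. Only the residual, which is the part of the spectrum that the calibration discards, shows the difference.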
Calibrated concentration data lets the user determine which failure mode is in play, which is useful when choosing remedial actions. Once the model has been trained, however, the concentration data adds less value, because the model can correlate spectral variation with specific failure modes directly. In this way, the model could identify a “Lactobacillus infection” without directly measuring the rise in lactic acid.
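As a rough illustration of classifying a failure mode straight from spectra, the sketch below trains a nearest-centroid classifier on synthetic, labelled spectra. Everything here is an assumption made for the example: the spectra are generated, the "infection signature" band is hypothetical, and nearest-centroid is simply the easiest technique to show in a few lines, not the method Keit uses. Note that no lactic acid concentration is ever computed.

```python
import math
import random

def synthetic_spectrum(infected, rng, n_points=40):
    """Make a fake spectrum: a common band plus, if infected, a weak extra band."""
    spectrum = []
    for i in range(n_points):
        value = math.exp(-((i - 12) / 4) ** 2)               # band present in every batch
        if infected:
            value += 0.3 * math.exp(-((i - 30) / 3) ** 2)    # hypothetical infection signature
        value += rng.gauss(0, 0.02)                          # measurement noise
        spectrum.append(value)
    return spectrum

def centroid(spectra):
    """Point-by-point mean of a list of equal-length spectra."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]

def train(labelled_spectra):
    """labelled_spectra: list of (label, spectrum). Returns {label: mean spectrum}."""
    groups = {}
    for label, spectrum in labelled_spectra:
        groups.setdefault(label, []).append(spectrum)
    return {label: centroid(group) for label, group in groups.items()}

def predict(model, spectrum):
    """Return the label whose mean spectrum is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], spectrum))

rng = random.Random(0)
training = [("normal", synthetic_spectrum(False, rng)) for _ in range(20)]
training += [("infection", synthetic_spectrum(True, rng)) for _ in range(20)]
model = train(training)

print(predict(model, synthetic_spectrum(True, rng)))
```

A production model would need real labelled batches and a more capable learner, but the shape of the workflow is the same: label historical spectra by outcome, train, then score each new spectrum as it arrives.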
The ultimate goal of this model is actionable control. With real-time insights, the model can drive decisions that directly impact the bottom line:
By transforming raw spectral data into a predictive machine learning model, producers can move from reactive troubleshooting to proactive optimization—securing higher yields and a lower Carbon Intensity (CI) score.