New tool accelerates identification of most promising fuels for reducing emissions & improving engine efficiency; Co-Optima
A new machine-learning tool has significantly sped up the calculation of chemical-reaction thermodynamics, making it possible to identify, in seconds, the most promising fuels for lower emissions and greater engine efficiency.
Developed as part of the U.S. Department of Energy’s (DOE’s) Co-Optimization of Fuels & Engines (Co-Optima) initiative (earlier post), A machine-Learning derived, Fast, Accurate Bond dissociation Enthalpy Tool (ALFABET) makes it possible for researchers to identify the most promising fuels for lower emissions and greater engine efficiency in seconds rather than days. An open-access paper on the work is published in Nature Communications.
Bond dissociation enthalpy (BDE) is the energy required to break a chemical bond between atoms in organic compounds. Knowing the BDE of every bond in a molecule allows researchers to predict its chemical reactions and so determine its suitability for certain uses, such as polymers or fuels, including biofuels. However, calculating BDEs with traditional quantum-mechanical methods requires resource-intensive simulations.
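In thermochemical terms, the BDE of a bond A-B is the combined enthalpy of the two radicals produced by homolytic cleavage minus the enthalpy of the parent molecule. The arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and the inputs are approximate experimental gas-phase heats of formation for methane, used here for illustration; they are not DFT or ALFABET output.

```python
def bond_dissociation_enthalpy(h_parent, h_radical_a, h_radical_b):
    """Return BDE = H(A*) + H(B*) - H(A-B), in the units of the inputs."""
    return h_radical_a + h_radical_b - h_parent

# Approximate 298 K heats of formation in kcal/mol (illustrative inputs).
H_CH4 = -17.9   # methane, the parent molecule
H_CH3 = 35.0    # methyl radical
H_H = 52.1      # hydrogen atom

bde_ch = bond_dissociation_enthalpy(H_CH4, H_CH3, H_H)
print(f"C-H BDE in methane: {bde_ch:.1f} kcal/mol")
```

With these inputs the C-H bond in methane comes out at roughly 105 kcal/mol. In a quantum-chemical workflow the same difference would be taken between computed enthalpies of the optimized parent and radical species.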
ALFABET significantly reduces the time, and therefore the computational cost, of calculating the energy required to break chemical bonds, while closely matching the accuracy of the traditional density-functional theory method. Built at the National Renewable Energy Laboratory, ALFABET is freely available via an interactive website.
With the modern machine-learning tools we’ve developed specifically for quantum chemical applications, we can get results of comparable accuracy quickly enough to be embedded in the ALFABET interactive website.—Peter St. John, lead author
a) Size of key elements of BDE-db.
b) Indexing and calculation of a single BDE reaction. For a given cleaved bond, SMILES strings of the parent molecule and two resulting radicals are passed for DFT optimization.
c) Learning curve for the model, plotting MAE (in kcal mol−1) on dev set BDEs against the number of molecules included in the training database. Both x and y axes are log-scaled, and error bars indicate standard deviation between three replicates.
d) 2D representations of bond embeddings are shown via the t-SNE algorithm after the first, third, and final message passing layers. Initially, bonds of similar classes are clustered close together in embedding space. In deeper layers of the model, the bond representations become more detailed as they come to reflect each bond’s specific local environment.
e) Structure of the GNN. Atom and bond state vectors are updated through a series of six message passing blocks. The final embedding layer is then used to predict the BDE of each bond. St. John et al.
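The message-passing idea in panel (e) can be illustrated with a toy sketch in plain Python. This is not the authors' architecture: the update rule (a residual tanh layer), the vector size, the three-atom graph, and all helper names are assumptions chosen for brevity, and the weights are random rather than trained, so the outputs carry no chemical meaning. Only the count of six blocks and the per-bond readout follow the caption.

```python
import math
import random

random.seed(0)
DIM = 4  # length of every state vector (arbitrary for this sketch)

def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def dense_tanh(vec, weights):
    """Compute tanh(vec @ weights) for a vector and a len(vec)-by-DIM matrix."""
    return [math.tanh(sum(v * row[j] for v, row in zip(vec, weights)))
            for j in range(len(weights[0]))]

def message_passing_block(atoms, bonds, pairs, w_bond, w_atom):
    """Update bond states from their end atoms, then atom states from their bonds."""
    new_bonds = []
    for state, (i, j) in zip(bonds, pairs):
        inputs = state + add(atoms[i], atoms[j])  # list '+' concatenates
        new_bonds.append(add(state, dense_tanh(inputs, w_bond)))  # residual update
    # Each atom receives the summed states of every bond that touches it.
    messages = [[0.0] * DIM for _ in atoms]
    for state, (i, j) in zip(new_bonds, pairs):
        messages[i] = add(messages[i], state)
        messages[j] = add(messages[j], state)
    new_atoms = [add(a, dense_tanh(a + m, w_atom)) for a, m in zip(atoms, messages)]
    return new_atoms, new_bonds

# Toy graph: the heavy-atom skeleton of ethanol, C-C-O (hydrogens omitted).
pairs = [(0, 1), (1, 2)]
atoms = [rand_vec(DIM) for _ in range(3)]   # stand-ins for learned atom embeddings
bonds = [rand_vec(DIM) for _ in pairs]      # stand-ins for learned bond embeddings

# Six message passing blocks, each with its own (random, untrained) weights.
for _ in range(6):
    atoms, bonds = message_passing_block(
        atoms, bonds, pairs, rand_matrix(2 * DIM, DIM), rand_matrix(2 * DIM, DIM))

# Readout: one scalar per bond, computed from the final bond state.
w_out = rand_vec(DIM)
predicted = [sum(s * w for s, w in zip(state, w_out)) for state in bonds]
```

After each block, a bond's state has absorbed information from atoms and bonds one step further away, which is consistent with the caption's observation that deeper layers encode a more specific local environment.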
Nearly all chemical reactions of organic compounds involve the breaking and forming of covalent bonds. The cumulative difference between BDE values of all bonds broken and formed in a chemical reaction provides an estimate of the overall reaction enthalpy.
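As a small worked example of that estimate, the Python sketch below sums the BDEs of bonds broken and subtracts the BDEs of bonds formed. The reaction (hydrogen abstraction from methane by a hydroxyl radical) and the rounded BDE values are approximate textbook figures chosen for illustration; they are not ALFABET predictions, and the function name is hypothetical.

```python
def reaction_enthalpy(bdes_broken, bdes_formed):
    """Estimate reaction enthalpy as sum(BDEs broken) - sum(BDEs formed)."""
    return sum(bdes_broken) - sum(bdes_formed)

# CH4 + OH radical -> CH3 radical + H2O, values in kcal/mol (approximate).
broken = [105.0]   # C-H bond in methane
formed = [119.0]   # O-H bond in water

delta_h = reaction_enthalpy(broken, formed)
print(f"Estimated reaction enthalpy: {delta_h:.1f} kcal/mol")
```

The negative result, about -14 kcal/mol, indicates an exothermic step: the bond formed is stronger than the bond broken.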
The most common way to calculate the BDEs of a molecule is density-functional theory (DFT). But, as the researchers noted, the process is tedious and computationally expensive, and it presents many opportunities for calculation errors that result in large deviations from experimental measurements.
Further, the researchers wrote, using DFT is not feasible when it comes to screening “thousands or millions of candidate structures” for potential applications.
ALFABET began with 42,577 “parent” molecules that consisted only of carbon, hydrogen, oxygen, and nitrogen atoms, drawn from PubChem, a free chemistry database. DFT calculations, which took more than 80 days of computer processing time, yielded 290,664 unique BDEs. With a predictive model trained on those results through machine learning, the researchers were able to closely match the accuracy of the DFT calculations.
What we did was use DFT to generate a large database of BDE calculations, and then trained a surrogate model to predict those values. With the machine learning model, we can get predictions so rapidly we can use them in the inner-loop of molecular design efforts.—Peter St. John
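The surrogate workflow described in the quote (run an expensive method once to build a database, fit a cheap model to it, then check the error on held-out data) can be sketched in Python as follows. Everything here is a stand-in: `expensive_reference` is a hypothetical placeholder formula rather than a DFT calculation, the data are synthetic, and a one-variable least-squares line takes the place of the graph neural network used in ALFABET.

```python
import random

def expensive_reference(x):
    """Hypothetical stand-in for a slow reference calculation."""
    return 3.0 * x + 80.0

random.seed(0)

# Step 1: build a database of reference values (with a little synthetic noise).
xs = [random.uniform(0.0, 10.0) for _ in range(200)]
ys = [expensive_reference(x) + random.gauss(0.0, 0.5) for x in xs]

# Step 2: hold out a dev set so the error is measured on unseen examples.
train_x, dev_x = xs[:160], xs[160:]
train_y, dev_y = ys[:160], ys[160:]

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x

# Step 3: train the surrogate on the training portion only.
slope, intercept = fit_line(train_x, train_y)

def surrogate(x):
    """Fast prediction that replaces a call to the expensive method."""
    return slope * x + intercept

# Step 4: report mean absolute error (MAE) on the dev set.
mae = sum(abs(surrogate(x) - y) for x, y in zip(dev_x, dev_y)) / len(dev_x)
print(f"dev-set MAE: {mae:.2f}")
```

Once trained, the surrogate costs a few arithmetic operations per prediction, which is what makes it practical to call inside a design loop that evaluates many candidates.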
St. John said ALFABET provides predictions that are used in developing detailed reaction mechanisms for biofuel combustion, accelerating identification of the most promising fuels for reducing emissions and improving engine efficiency.
Co-Optima research focuses on how simultaneous innovations in fuels and engines can improve fuel economy and vehicle performance while also reducing emissions. Sponsored by the Vehicle Technologies Office and the Bioenergy Technologies Office within the Department of Energy’s Office of Energy Efficiency and Renewable Energy, Co-Optima brings together nine national laboratories and more than 20 university and industry partners.
St. John, P.C., Guan, Y., Kim, Y. et al. (2020) “Prediction of organic homolytic bond dissociation enthalpies at near chemical accuracy with sub-second computational cost.” Nat Commun 11, 2328 doi: 10.1038/s41467-020-16201