Scientists from an international consortium led by researchers at the US Department of Energy’s Argonne and Idaho national laboratories have recently proposed a comprehensive new data science paradigm called the Battery Data Genome—an undertaking to develop uniform data acquisition and data sharing practices across the wide-ranging battery community. These practices will create an extensive database network to enable energy storage breakthroughs using artificial intelligence (AI).
The electrochemical science that is urgently needed for a zero-carbon economy requires state-of-the-art data science. Tackling the extremely complex technical questions that battery scientists face requires huge amounts of data to generate AI and machine learning algorithms.—Argonne battery scientist Sue Babinec
Although there are some specialized data science projects for batteries, like the Electrolyte Genome Project, the undertaking to create a Battery Data Genome devoted to all aspects of the battery and that unifies work done across institutions and scales is unprecedented.—Argonne distinguished fellow and Joint Center for Energy Storage Research director George Crabtree
According to Crabtree, the Battery Data Genome will collect and house data from every step of the battery lifecycle, from discovery to development to manufacturing and all manner of deployments. Each segment of the battery community needs universal data management standards so that the data it creates can unlock the power of AI algorithms designed to identify everything from new candidate electrode materials to improved battery pack construction to cell lifetimes.
According to Argonne battery scientist Noah Paulson, scientists are interested in many different characteristics and qualities when measuring a battery’s performance. Because of this, the datasets that are collected by different groups, even those looking at the same battery in the same setup, will not be identical.
We have to find the basic set of information that should be associated with each set of data, so that we no longer have to spend time cleaning the data to fit our models.—Noah Paulson
For batteries, there are many common types of data, but there’s no uniform way of approaching them. When data come in many different formats, don’t include how they’re collected, and aren’t frequently shared among different groups, it becomes very difficult to do the kind of large-scale AI analysis and predictions necessary to speed the development and deployment of new batteries.—Argonne computational scientist Logan Ward
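The "basic set of information" that Paulson and Ward describe could be pictured as a small, required metadata record attached to every dataset. The sketch below is only an illustration: the field names and the `missing_fields` helper are hypothetical and are not a schema defined by the Battery Data Genome or by any Argonne software.

```python
from dataclasses import dataclass, fields
from typing import List


# Hypothetical minimal metadata record. The field names are illustrative
# assumptions, not a schema published by the Battery Data Genome.
@dataclass
class CellMetadata:
    cell_id: str
    cathode_chemistry: str
    nominal_capacity_ah: float
    test_temperature_c: float
    cycler_instrument: str
    source_institution: str


def missing_fields(record: dict) -> List[str]:
    """Return the required metadata fields that are absent or empty in record."""
    required = [f.name for f in fields(CellMetadata)]
    return [name for name in required if record.get(name) in (None, "")]


# A dataset shared without its collection details is flagged before use.
record = {
    "cell_id": "cell-001",
    "cathode_chemistry": "NMC",
    "nominal_capacity_ah": 3.0,
}
print(missing_fields(record))
```

A check of this kind, run when data are shared, is one way a group could avoid the cleaning step Paulson mentions, since incomplete records are caught at the source rather than at modeling time.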
Getting different groups of researchers, each studying a different stage of a battery’s development, to create a universal set of data that can be widely accessed, understood and used represents a significant challenge, Babinec said.
To attract as many participants as possible, the Battery Data Genome offers many options for data sharing. This flexibility could make participation more attractive to industry partners, who could take advantage of the data produced by academic or government partners without necessarily having to contribute their own.
Once scientists populate the Battery Data Genome with data, they will have to test it. To do so, they will pose “challenge problems” that use the data in the Battery Data Genome to validate the best AI algorithms against real-world questions.
A standardized, easily accessible and extensive dataset may spur new questions for the battery community, Crabtree said.
There are a lot of ‘unknown unknowns’ in batteries. With access to data that all conform to a universal set of standards, guided by machine learning and artificial intelligence, we may find new pathways for innovation that to date we have not yet considered.—George Crabtree
Argonne already provides open software for clean-up of existing data files with the “battery-data-toolkit”. A release of complete files for establishing cycle-life of 300 lithium-ion batteries with six different cathode chemistries will be coming later in October.
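To make "cycle life" concrete, here is a minimal sketch of how it might be computed from per-cycle discharge capacities. It assumes the common convention that a cell reaches end of life when its capacity drops below 80% of its first-cycle value. The `cycle_life` function and its threshold are illustrative assumptions; this is not the battery-data-toolkit API, and the capacity values are made up for the example.

```python
from typing import Optional, Sequence


def cycle_life(capacities_ah: Sequence[float], threshold: float = 0.8) -> Optional[int]:
    """Return the first 1-based cycle whose capacity falls below
    threshold * first-cycle capacity, or None if that never happens.

    The 80% default is an assumed end-of-life convention.
    """
    if len(capacities_ah) == 0:
        raise ValueError("need at least one cycle of data")
    limit = threshold * capacities_ah[0]
    for cycle_number, capacity in enumerate(capacities_ah, start=1):
        if capacity < limit:
            return cycle_number
    return None


# Made-up capacities (Ah) for illustration only.
print(cycle_life([1.0, 0.95, 0.85, 0.79, 0.7]))
```

With data files that share one format and one capacity definition, the same function could be applied unchanged across cells from different groups and chemistries.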
A paper announcing the Battery Data Genome will appear in the 19 October issue of Joule.