From Big Data to Smart Data: Data-Efficient Machine Learning for Materials and Energy Research
Prof. Karsten Reuter
Fritz-Haber-Institut der Max-Planck-Gesellschaft,
Faradayweg 4-6, D-14195 Berlin, Germany
reuter@fhi-berlin.mpg.de
Data science is now entering theoretical catalysis and energy-related research in full force. Automated workflows and machine learning models trained on first-principles data generate predictive-quality insight into elementary processes and process energetics at an unprecedented pace. Computational screening and data mining make it possible to explore the resulting databases for promising materials and to extract correlations such as structure-property relationships. At present, these efforts are still largely based on highly reductionist models that break down the complex interdependencies of working catalysts and energy conversion devices into a tractable number of so-called descriptors, i.e., microscopic parameters that are believed to govern the macroscopic function. Static, predefined databases are also generally the norm. Future efforts will concentrate on using artificial intelligence in the actual generation and reinforced improvement of the reductionist models as well, and on devising active learning approaches that generate the truly required data on demand. In this talk, I will briefly survey these developments with examples from our own research, in particular on data-efficient approaches to reaction kinetics and active machine learning for the design of organic semiconductors.