
DataXAS
Data science for X-ray Absorption Spectroscopy

Abstract
The overall goal of the project is to develop data-driven approaches for the enhancement and characterization of high-throughput X-ray absorption spectroscopy (XAS) analyses at PSI.
The primary objectives are to 1) develop generic machine learning tools for noise suppression in XAS data, improving the achievable time resolution of quick-scanning XAS on challenging materials of interest; 2) apply blind source separation to deconvolve the principal components that form large time-series spectroscopic datasets; 3) develop a classification algorithm to extract the structural motif of catalytic species from operando XAS data; and 4) expand the data science expertise of the catalysis community through education initiatives.
The project is expected to yield new, generalizable methods for XAS data processing and characterization that will be implemented at the SuperXAS and Debye beamlines of the SLS at PSI and within two major initiatives for sustainable catalysis in Switzerland: the ETH domain Catalysis Hub and the NCCR Catalysis.
People
Collaborators


During 2023, Ilnura was a postdoctoral researcher at the Technion, Israel, working with Prof. Dr. Kfir Levy. She received her PhD in September 2022 from the Automatic Control Laboratory (IfA) at ETH Zürich. She joined IfA in 2017 as a member of Prof. Dr. Maryam Kamgarpour's group and was co-supervised by Prof. Dr. Andreas Krause, working on safe learning methods. She obtained her Master's degree in Applied Mathematics jointly from Institut Polytechnique de Grenoble (France) and Moscow Institute of Physics and Technology (Russia) in 2017. Her Master's thesis research was on stochastic optimization theory under the supervision of Prof. Dr. Anatoli Juditsky. She obtained her Bachelor's degree in Physics and Applied Mathematics from the Moscow Institute of Physics and Technology (Russia) in 2015. Ilnura was born in Almaty, Kazakhstan, where she also finished high school.


Benjamín Béjar received a PhD in Electrical Engineering from Universidad Politécnica de Madrid in 2012. He served as a postdoctoral fellow at École Polytechnique Fédérale de Lausanne until 2017, and then moved to Johns Hopkins University, where he held a Research Faculty position until December 2019. His research interests lie at the intersection of signal processing and machine learning, and he has worked on topics such as sparse signal recovery, time-series analysis, and computer vision, with special emphasis on biomedical applications. Since 2021, Benjamín has led the SDSC office at the Paul Scherrer Institute in Villigen.
Motivation
Investigating catalytic active sites is challenging because they often consist of only a few atoms among a majority of spectator atoms of the same elements, and identifying them often requires time-resolved, synchrotron-based X-ray tools. Advanced synchrotron spectroscopy techniques give direct access to the geometric and electronic structure of materials under operation. Within the broad ensemble of synchrotron techniques, X-ray absorption spectroscopy (XAS) is particularly well suited to the study of catalytic active sites: it elucidates, with element specificity, the local atomic configuration and electronic state of catalyst active sites under operando conditions.

The application of state-of-the-art data science techniques to catalysis is still in its infancy, and advancing it is an important short-, medium-, and long-term goal for the field. This project aims to provide a fundamental step towards this goal by establishing robust and self-consistent workflows for data collection, processing, and analysis, applied to high-throughput spectroscopic data. The general objective of this project is to develop a data-driven methodology for handling and processing XAS data.
Proposed Approach / Solution
For the first task, denoising [2], we propose an efficient approach for noisy XAS spectra based on Gaussian process (GP) regression with adaptive length scales. We propose a warping procedure that preprocesses the data to make the signals more stationary; this step alone already improves all the classical denoising methods (see Figure 2). For data with higher noise levels, we propose applying GP regression to the warped signals.
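The idea of warping followed by GP regression can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the project's actual method: the toy spectrum, the function names (`warp_axis`, `gp_denoise`), the particular warp (cumulative magnitude of a smoothed gradient), the squared-exponential kernel, and the fixed hyperparameters, including a noise variance that is taken as known. The warp stretches the axis where the signal changes quickly, so a single length scale in the warped coordinate behaves like an adaptive length scale in the original one.

```python
import numpy as np

def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def warp_axis(x, y, window=9, floor=0.05):
    """Hypothetical warp: stretch the axis where the smoothed signal changes fast.

    Returns a strictly increasing coordinate u in [0, 1].
    """
    pad = window // 2
    padded = np.pad(y, pad, mode="edge")            # avoid artificial drops at the ends
    smooth = np.convolve(padded, np.ones(window) / window, mode="valid")
    speed = np.abs(np.gradient(smooth, x))
    speed = speed / speed.max() + floor             # floor keeps u strictly increasing
    steps = 0.5 * (speed[1:] + speed[:-1]) * np.diff(x)   # trapezoidal rule
    u = np.concatenate([[0.0], np.cumsum(steps)])
    return u / u[-1]

def gp_denoise(u, y, length_scale=0.05, signal_var=1.0, noise_var=0.05 ** 2):
    """Posterior mean of a constant-mean GP, evaluated at the training inputs."""
    offset = y.mean()
    K = rbf_kernel(u, u, length_scale, signal_var)
    chol = np.linalg.cholesky(K + noise_var * np.eye(len(u)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y - offset))
    return offset + K @ alpha

def make_toy_spectrum(n=200, noise_sd=0.05, seed=0):
    """Synthetic edge plus damped oscillations; not real XAS data."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)
    edge = 0.5 * (1.0 + np.tanh((x - 0.3) / 0.02))
    wiggles = 0.1 * np.sin(40.0 * (x - 0.3)) * np.exp(-3.0 * (x - 0.3)) * (x > 0.3)
    clean = edge + wiggles
    return x, clean, clean + noise_sd * rng.normal(size=n)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

x, clean, noisy = make_toy_spectrum()
u = warp_axis(x, noisy)            # warped coordinate computed from the noisy signal
denoised = gp_denoise(u, noisy)    # GP regression in the warped coordinate
print(f"RMSE noisy:    {rmse(noisy, clean):.4f}")
print(f"RMSE denoised: {rmse(denoised, clean):.4f}")
```

In practice the kernel hyperparameters would be fitted, for example by maximizing the marginal likelihood, rather than fixed by hand as in this sketch.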

In the second task, blind signal separation [3], we encounter the rotational ambiguity inherent in the classical Nonnegative Matrix Factorization (NMF) formulation. In particular, as shown in Figure 3, multiple sets of pure components can yield the same quality of NMF representation. We propose to address this issue by regularizing the NMF formulation using a database of real pure components.
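The rotational ambiguity arises because a factorization X ≈ WH can be rewritten as (WT⁻¹)(TH) for any invertible T that keeps both factors nonnegative. One possible way to regularize toward a database, shown here purely as an assumed formulation and not as the project's actual objective, is to add a penalty λ‖H − R‖² that ties the components H to reference spectra R. The sketch below uses standard multiplicative updates for this penalized objective; the data, the reference matrix `R`, the function names, and the value of `lam` are all hypothetical.

```python
import numpy as np

def objective(X, W, H, R, lam):
    """Reconstruction error plus a penalty tying components H to references R."""
    return float(np.sum((X - W @ H) ** 2) + lam * np.sum((H - R) ** 2))

def reference_regularized_nmf(X, R, lam=1.0, n_iter=300, seed=0, eps=1e-12):
    """Multiplicative updates for min ||X - WH||^2 + lam * ||H - R||^2 with W, H >= 0."""
    rng = np.random.default_rng(seed)
    k = R.shape[0]
    W = rng.uniform(0.1, 1.0, size=(X.shape[0], k))   # random positive concentrations
    H = R.copy() + 1e-3                               # start at the references, strictly positive
    history = [objective(X, W, H, R, lam)]
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        H *= (W.T @ X + lam * R) / ((W.T @ W) @ H + lam * H + eps)
        history.append(objective(X, W, H, R, lam))
    return W, H, history

# Toy time series: two synthetic "pure" spectra mixed with linearly varying weights.
energy = np.linspace(0.0, 1.0, 80)

def peak(center, width):
    return np.exp(-0.5 * ((energy - center) / width) ** 2)

H_true = np.vstack([peak(0.35, 0.08) + 0.2, peak(0.60, 0.10) + 0.2])
t = np.linspace(0.0, 1.0, 40)
W_true = np.column_stack([1.0 - t, t])

rng = np.random.default_rng(1)
X = np.clip(W_true @ H_true + 0.01 * rng.normal(size=(40, 80)), 0.0, None)

# Hypothetical database entries: slightly perturbed versions of the true components.
R = np.clip(H_true * (1.0 + 0.05 * rng.normal(size=H_true.shape)), 0.0, None)

W, H, history = reference_regularized_nmf(X, R, lam=1.0)
print(f"objective: {history[0]:.3f} -> {history[-1]:.3f}")
```

A larger `lam` keeps the recovered components closer to the database entries, at the cost of a worse fit when the true components differ from the references.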

In the third task, we use a labeled dataset of real measured spectra to train a classifier [4], [5], [6], [7] for X-ray absorption near-edge structure (XANES) spectra, possibly based on neural networks (NN), that determines the key geometrical and structural features of the investigated chemical compounds directly from the spectrum.
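As a rough illustration of spectrum classification, the sketch below trains a single-layer softmax classifier, the simplest form of a neural network, on synthetic labeled spectra. It is an assumption-laden toy and not the project's model: the two classes differ only by an artificial pre-edge peak that stands in for a structural motif, and all names (`make_toy_xanes`, `train_softmax`, `predict`), the learning rate, and the iteration count are chosen for illustration.

```python
import numpy as np

def make_toy_xanes(n_per_class=100, n_points=60, noise_sd=0.02, seed=0):
    """Synthetic stand-in for labeled spectra: class 1 has an extra pre-edge peak."""
    rng = np.random.default_rng(seed)
    energy = np.linspace(0.0, 1.0, n_points)
    edge = 0.5 * (1.0 + np.tanh((energy - 0.4) / 0.03))
    pre_edge = 0.3 * np.exp(-0.5 * ((energy - 0.3) / 0.03) ** 2)
    labels = np.repeat([0, 1], n_per_class)
    spectra = edge + labels[:, None] * pre_edge
    spectra = spectra + noise_sd * rng.normal(size=spectra.shape)
    order = rng.permutation(len(labels))            # shuffle before splitting
    return spectra[order], labels[order]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)            # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, y, n_classes, lr=1.0, n_iter=500):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                        # one-hot targets
    losses = []
    for _ in range(n_iter):
        P = softmax(X @ W + b)
        losses.append(float(-np.mean(np.log(P[np.arange(n), y] + 1e-12))))
        G = (P - Y) / n                             # gradient of the loss w.r.t. the logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b, losses

def predict(X, W, b):
    return np.argmax(X @ W + b, axis=1)

spectra, labels = make_toy_xanes()
train_X, test_X = spectra[:150], spectra[150:]
train_y, test_y = labels[:150], labels[150:]

mean = train_X.mean(axis=0)                         # center using training statistics only
W, b, losses = train_softmax(train_X - mean, train_y, n_classes=2)
accuracy = float(np.mean(predict(test_X - mean, W, b) == test_y))
print(f"held-out accuracy: {accuracy:.2f}")
```

Real XANES spectra differ in far subtler ways than this toy pre-edge feature, so a deeper network and careful validation on measured data would likely be needed.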
Impact
The combined denoising, classification, and decomposition techniques will help domain scientists determine the components of time-resolved XAS spectra with greater certainty and verify them by checking fewer candidate components, thereby streamlining compound characterization.
Annexe
Cover image source: Adobe Stock
Additional resources
Bibliography
[1] S. Pascarelli (ESRF), ‘X-Ray Absorption Spectroscopy: Fundamentals and Simple Model of EXAFS’, can be found under https://indico.ill.fr/indico/event/75/material/slides/37.pdf, 2017.
[2] H. Abe, G. Aquilanti, R. Boada, B. Bunker, P. Glatzel, M. Nachtegaal, S. Pascarelli, ‘Improving the quality of XAFS data’, J. Synchrotron Radiat. 2018.
[3] A. de Juan, R. Tauler, ‘Multivariate Curve Resolution (MCR) from 2000: Progress in Concepts and Applications’, http://dx.doi.org/10.1080/10408340600970005, 2007, 36, 163–176.
[4] M. R. Carbone, M. Topsakal, D. Lu, S. Yoo, ‘Machine-Learning X-Ray Absorption Spectra to Quantitative Accuracy’, Phys. Rev. Lett. 2020, 124, 156401.
[5] M. R. Carbone, S. Yoo, M. Topsakal, D. Lu, ‘Classification of local chemical environments from x-ray absorption spectra using supervised machine learning’, Phys. Rev. Mater. 2019, 3, 1–11.
[6] J. Timoshenko, D. Lu, Y. Lin, A. I. Frenkel, ‘Supervised Machine-Learning-Based Determination of Three-Dimensional Structure of Metallic Nanoparticles’, J. Phys. Chem. Lett. 2017, 8, 5091–5098.
[7] J. Timoshenko, A. I. Frenkel, ‘“Inverting” X-ray Absorption Spectra of Catalysts by Machine Learning in Search for Activity Descriptors’, ACS Catal. 2019, 9, 10192–10211.