Machine Learning tools for Analytical Transmission Electron Microscopy

January 1, 2021
In Progress


Modern transmission electron microscopes are equipped with aberration correctors and high-resolution spectrometers, which enable spatially resolved chemical analysis of a nearly unlimited variety of materials at nanometric or even atomic length scales. The ultimate ambition of a researcher is to segment and accurately quantify the analytical data recorded from a sample, in order to precisely characterize the spatial distribution and nature of each of its phases. Currently, however, this aim can be realized in only a limited fraction of cases, owing to factors including signal convolutions and non-linear backgrounds, detector and signal noise characteristics, and segmentation challenges arising from projection effects.

This project aims to address these limitations with a radically new approach to data analysis.

To that end, we aim to develop an integrated approach that tightly couples three elements: prior information; a directed “dictionary learning” approach using suitable data science tools; and physical model-based quantification. With interactive feedback between these elements, it should be possible to achieve a step change in the quality of data analysis. A successful project would therefore significantly increase researchers' ability to leverage their data for new scientific insights and technological advances across a wide range of fields.



SDSC Team:
Guillaume Obozinski
Nathanaël Perraudin

PI | Partners:

Electron Spectrometry and Microscopy Laboratory:

  • Prof. Cécile Hébert
  • Dr. Adrien Teurtrie
  • Dr. Duncan Alexander
  • Hui Chen

More info



The spectroscopic data obtained in scanning transmission electron microscopes (STEM) are in most cases not straightforward to analyse because of noise and mixed features (both spectrally and spatially). The current state of the art for data analysis in this community is limited to basic ML algorithms such as principal component analysis (PCA) or non-negative matrix factorization (NMF), which in general do not retrieve the original physical features of the observed sample. In the MLATEM project, we aim to design a physics-guided ML algorithm that retrieves the physically correct features of the observed sample. The SDSC provides expertise in cutting-edge ML techniques to develop this algorithm, while the collaborating team provides the domain knowledge.
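To make the baseline concrete, here is a minimal sketch of plain NMF applied to a synthetic spectrum image. It is not the MLATEM algorithm. It uses the multiplicative updates of Lee and Seung (reference 1 below) on hypothetical data: two phases, each with a single Gaussian peak, mixed per pixel and corrupted with Poisson noise. The function name `nmf_multiplicative`, the peak positions, the image size and the count level are all illustrative assumptions.

```python
import numpy as np


def nmf_multiplicative(X, n_components, n_iter=200, seed=0, eps=1e-12):
    """Factorize a non-negative matrix X (pixels x channels) as W @ H.

    Uses the Lee-Seung multiplicative updates for the Frobenius-norm
    objective. W holds per-pixel abundances, H holds component spectra.
    """
    rng = np.random.default_rng(seed)
    n_pixels, n_channels = X.shape
    W = rng.random((n_pixels, n_components)) + eps
    H = rng.random((n_components, n_channels)) + eps
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(X - W @ H))
    return W, H, errors


# Hypothetical energy axis and two synthetic "phase" spectra (assumptions).
energy = np.linspace(0.0, 10.0, 100)


def gaussian_peak(center, width):
    return np.exp(-0.5 * ((energy - center) / width) ** 2)


true_spectra = np.stack([gaussian_peak(3.0, 0.4), gaussian_peak(6.5, 0.6)])

# An 8 x 8 pixel map, flattened to 64 rows, with random phase abundances.
rng = np.random.default_rng(1)
true_abundances = rng.random((64, 2))
clean = 200.0 * true_abundances @ true_spectra

# Poisson noise as a simple stand-in for counting statistics (assumption).
X = rng.poisson(clean).astype(float)

W, H, errors = nmf_multiplicative(X, n_components=2)
print(f"residual after first iteration: {errors[0]:.1f}")
print(f"residual after last iteration:  {errors[-1]:.1f}")
```

Even when the residual is small, the factors returned by NMF are only defined up to scaling and permutation, and in general they may mix the true phases. Nothing in the objective ties the rows of `H` to physical spectra, which is the gap a physics-guided approach is meant to close.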


Our goal is to develop a new standard algorithm for the electron microscopy community.



Additional resources


  1. D.D. Lee, H.S. Seung, Learning the parts of objects by non-negative matrix factorization, Nature. 401 (1999) 788.
  2. Trebbia, P., Bonnet, N., 1990. Ultramicroscopy 34, 165–178 & Bonnet, N., Brun, N., Colliex, C., 1999.
  3. K.J. Dudeck, M. Couillard, S. Lazar, C. Dwyer, G.A. Botton, Quantitative statistical analysis, optimization and noise reduction of atomic resolved electron energy loss spectrum images, Micron. 43 (2012) 57–67.
  4. A.B. Yankovich, C. Zhang, A. Oh, T.J.A. Slater, F. Azough, R. Freer, S.J. Haigh, R. Willett, P.M. Voyles, Non-rigid registration and non-local principle component analysis to improve electron microscopy spectrum images, Nanotechnology. 27 (2016) 364001.
  5. B.H. Martineau, D.N. Johnstone, A.T.J. van Helvoort, P.A. Midgley, A.S. Eggeman, Unsupervised machine learning applied to scanning precession electron diffraction data, Adv. Struct. Chem. Imaging. 5 (2019) 3.
  6. M. Shiga, K. Tatsumi, S. Muto, K. Tsuda, Y. Yamamoto, T. Mori, T. Tanji, Sparse modeling of EELS and EDX spectral imaging data by nonnegative matrix factorization, Ultramicroscopy. 170 (2016) 43–59.
  7. J. Verbeeck, S. Van Aert, Model based quantification of EELS spectra, Ultramicroscopy. 101 (2004) 207–224.

