
CHEMSPEC
Cost-effective chemical speciation monitoring of particulate matter air pollution

Abstract
Ambient particulate matter is one of the greatest environmental health risks to society, but cost-effective methods for the chemical characterization and source apportionment needed to inform robust regulatory strategies are lacking. Our team has demonstrated that infrared spectroscopy can be used as an inexpensive optical probe to obtain chemically informative spectra of particulate matter, and this analytical technique is increasingly being used in air quality monitoring networks and research campaigns worldwide. To obtain reliable predictions at this scale, advanced data-driven methods are needed to take full advantage of the rich but complex infrared spectra acquired from these samples.
CHEMSPEC brought together an unprecedented data set from multiple monitoring networks and laboratory-generated particles to learn the relations between chemical composition and spectroscopic signatures that can be used in quantitative modeling of chemical species. The results of this project have provided a step forward in the characterization of particulate matter, not only in existing monitoring networks, but in critical areas of the world where the particulate matter composition is virtually unknown.
People
Collaborators


Francois received his PhD in December 2018 from Stellenbosch University (SU) in South Africa where he focused on improving inference algorithms for probabilistic graphical models. Before joining the SDSC, Francois worked as a senior lecturer at the Department of Statistics and Actuarial Science at SU. His research interests include probabilistic graphical models, time series forecasting, causal inference and discovery, extreme value theory, and machine learning in general.


Ekaterina received her PhD in Computer Science from Moscow Institute for Physics and Technology, Russia. Afterwards, she worked as a researcher at the Institute for Information Transmission Problems in Moscow and later as a postdoctoral researcher in the Stochastic Group at the Faculty of Mathematics at University Duisburg-Essen, Germany. She has experience with various applied projects on signal processing, predictive modelling, macroeconomic modelling and forecasting, and social network analysis. She joined the SDSC in November 2019. Her interests include machine learning, non-parametric statistical estimation, structural adaptive inference, and Bayesian modelling.


Guillaume Obozinski graduated with a PhD in Statistics from UC Berkeley in 2009. He did his postdoc and then held a researcher position until 2012 in the Willow and Sierra teams at INRIA and Ecole Normale Supérieure in Paris. He was then Research Faculty at Ecole des Ponts ParisTech until 2018. Guillaume has broad interests in statistics and machine learning and has worked on sparse modeling, optimization for large-scale learning, graphical models, relational learning, and semantic embeddings, with applications in domains ranging from computational biology to computer vision.
Motivation
Infrared spectroscopy has been shown to be a relatively inexpensive tool for generating chemically informative spectra of particulate matter and is becoming increasingly popular in air quality monitoring networks worldwide. To fully develop this tool, machine learning methods are required to remove interference from the spectra and decompose the resulting curves into their chemical components so that accurate, interpretable and robust predictions can be made.
Proposed Approach
To remove interference from the spectra, we designed a probabilistic model that combines a loss function with a covariance model fitted to blank spectra, i.e., spectra free from any ambient particulate matter. In this correction procedure, all hyper-parameters are calibrated using the empirical Bayes principle. Once the background interference has been removed, the resulting spectra can be fed to an EM algorithm to learn the profiles of chemical species and perform predictions.
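The background-removal idea can be illustrated with a minimal sketch. It is not the project's actual model: it assumes a plain Gaussian background whose mean and covariance are estimated from blank spectra, a user-supplied mask of channels assumed to be free of analyte absorption, and a fixed ridge term where the project would calibrate hyper-parameters by empirical Bayes. The function names `fit_blank_model` and `remove_background`, and all the data, are hypothetical and synthetic.

```python
import numpy as np


def fit_blank_model(blanks, shrinkage=1e-6):
    """Estimate the mean and a ridge-regularised covariance of blank spectra.

    `shrinkage` is a fixed illustrative value (an assumption of this sketch).
    """
    mu = blanks.mean(axis=0)
    centered = blanks - mu
    cov = centered.T @ centered / (blanks.shape[0] - 1)
    cov += shrinkage * np.eye(cov.shape[0])
    return mu, cov


def remove_background(spectrum, mu, cov, background_only):
    """Subtract a background estimate from one spectrum.

    `background_only` is a boolean mask of channels assumed analyte-free.
    In the remaining channels the background is predicted by Gaussian
    conditioning: mu_a + C_ab C_bb^{-1} (y_b - mu_b).
    """
    b = background_only
    a = ~background_only
    weights = np.linalg.solve(cov[np.ix_(b, b)], spectrum[b] - mu[b])
    background = np.empty_like(spectrum)
    background[b] = spectrum[b]  # these channels are treated as pure background
    background[a] = mu[a] + cov[np.ix_(a, b)] @ weights
    return spectrum - background


# Synthetic demonstration: smooth polynomial baselines plus a small noise term.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 61)
basis = np.stack([np.ones_like(x), x, x**2])
blanks = rng.normal(size=(200, 3)) @ basis + 1e-3 * rng.normal(size=(200, 61))

# One "sample" spectrum: a new baseline plus a Gaussian absorption peak.
peak = np.exp(-0.5 * ((x - 0.5) / 0.03) ** 2)
sample = rng.normal(size=3) @ basis + peak + 1e-3 * rng.normal(size=61)

mask = np.abs(x - 0.5) > 0.15  # channels assumed to contain no analyte signal
mu, cov = fit_blank_model(blanks)
corrected = remove_background(sample, mu, cov, mask)
```

In this toy setting the corrected spectrum should closely follow the synthetic peak inside the unmasked window and be zero elsewhere. Real spectra would need a more careful choice of analyte-free regions and of the regularisation strength.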
Impact
The anticipated impact of this project is to expand the capabilities of monitoring networks worldwide, especially in areas where particle composition is virtually unknown, so that they can cost-effectively provide the chemical composition information required to identify the key sources and constituents contributing to PM2.5 at every location. This information is critical for shaping the necessary emissions control policies, evaluating air quality action plans, and supporting air quality modeling and remote sensing efforts.
Note. Here PM2.5 denotes the mass concentration of ambient particulate matter with a diameter of less than 2.5 micrometers.

Annexe
Additional resources
Bibliography
- Solomon, P. A. et al. U.S. National PM2.5 Chemical Speciation Monitoring Networks—CSN and IMPROVE: Description of networks. Journal of the Air & Waste Management Association 64, 1410–1438. eprint: https://doi.org/10.1080/10962247.2014.956904.
- Reggente, M., Dillner, A. M. & Takahama, S. Analysis of functional groups in atmospheric aerosols by infrared spectroscopy: systematic intercomparison of calibration methods for US measurement network samples. Atmospheric Measurement Techniques 12, 2287–2312.
- Bürki, C. et al. Analysis of functional groups in atmospheric aerosols by infrared spectroscopy: method development for probabilistic modeling of organic carbon and organic matter concentrations. Atmospheric Measurement Techniques 13, 1517–1538 (2020).
- Boris, A. J. et al. Quantifying organic matter and functional groups in particulate matter filter samples from the southeastern United States – Part 2: Spatiotemporal trends. Atmospheric Measurement Techniques 14, 4355–4374. https://amt.copernicus.org/articles/14/4355/2021/ (2021).