IRMA

Interpretable and Robust Machine Learning for Mobility Analysis

Started

November 1, 2021

Status

Completed

Artificial intelligence (AI) is transforming many areas of our lives and ushering in a new era of technological advancement. The transportation sector in particular stands to benefit from progress in AI through the development of intelligent transportation systems. Building such systems requires an intricate combination of artificial intelligence and mobility analysis. The past few years have seen rapid development in transportation applications using advanced deep neural networks. However, such networks are often difficult to interpret and lack robustness, which slows the deployment of these AI-powered algorithms in practice. To improve their usability in deployment, increasing research effort has been devoted to developing interpretable and robust machine learning methods, among which causal inference approaches have recently gained traction because they can provide interpretable and actionable information. However, most of these methods are developed for image or sequential data and do not satisfy the specific requirements of mobility data analysis. These requirements have been studied intensively in the Geographic Information Science (GIScience) field but have not yet been well utilized in developing machine learning models. The goal of our project is to bring together knowledge from GIScience and machine learning, advancing our understanding of how interpretable and robust machine learning methods can be tailored to mobility analysis with the support of causal inference. The outcome of this research will deepen our understanding of how to integrate AI technologies and GIScience for mobility analysis, making AI in the transportation sector more interpretable and reliable. Ultimately, we aim to facilitate the deployment of AI in intelligent transportation systems and build a safer, more efficient, and more sustainable transportation system in the future.

People

Collaborators

SDSC Team:

Simon Dirmeier

Senior Data Scientist

Simon joined the SDSC as a senior data scientist in April 2022. He conducted his doctoral research on statistical modeling of high-dimensional genetic data at ETH Zürich and earned his MSc and BSc degrees in computer science from the Technical University of Munich. His research interests lie broadly in probabilistic machine and deep learning, generative modeling, causal inference, and approximate inference. Simon is an avid open-source software contributor, with a particular enthusiasm for probabilistic programming languages and numerical computing.

Fernando Perez-Cruz

Former Deputy Executive Director & Chief Data Scientist

Fernando Perez-Cruz received a PhD in Electrical Engineering from the Technical University of Madrid. He is Titular Professor in the Computer Science Department at ETH Zurich and Head of Machine Learning Research and AI at Spiden. He has been a member of the technical staff at Bell Labs and a Machine Learning Research Scientist at Amazon. Fernando has been a visiting professor at Princeton University under a Marie Curie Fellowship and an associate professor at University Carlos III in Madrid. He held positions at the Gatsby Unit (London), Max Planck Institute for Biological Cybernetics (Tuebingen), and BioWulf Technologies (New York). Fernando served as Chief Data Scientist at the SDSC from 2018 to 2023 and as Deputy Executive Director of the SDSC from 2022 to 2023.

PI | Partners:

ETH Zurich, Mobility Information Engineering Lab:

Prof. Dr. Martin Raubal
Dr. Yanan Xin
Ye Hong

Motivation

Recent research on computational methods for mobility analysis has focused on black-box deep learning methods because of their superior predictive power compared to conventional methods in many mobility-related applications. Despite their predictive performance, deep learning-based algorithms typically have several shortcomings: a) they lack interpretability, b) they generally do not provide uncertainty estimates, c) it is unclear whether they are robust to distributional shifts in the input data, and d) they are typically not privacy-preserving, i.e., the trained neural network weights can reflect confidential information, for instance, when trained on personal GPS tracking and location data.

Proposed Approach / Solution

In this project, we develop novel methods for enhancing the interpretability and robustness of machine learning models for mobility analysis. We first develop a benchmarking framework and datasets by simulating synthetic mobility data using both mechanistic models and generative AI models (Figure 1). The mechanistic models are designed to generate controlled interventional data for evaluating the robustness of neural networks (Hong et al., 2023). The generative denoising diffusion model is developed to simulate privacy-preserving mobility data (Dirmeier et al., 2024), since it does not rely on statistics of the data that might reveal information about individuals. Based on the simulated data, we assess how robust deep learning-based predictors are to distributional shifts in the input data, and we present an approach that discerns in-distribution data from out-of-distribution data based on density estimation (Figure 2; Dirmeier et al., 2023). In addition, we present a data-driven approach to inform decision-making in mobility using counterfactual explanations (Figure 3; Wang et al., 2024).

Impact

The outcome of this research will deepen our understanding of how to integrate AI technologies and GIScience for mobility analysis, making AI in the transportation sector more interpretable and reliable. Ultimately, we aim to facilitate the deployment of AI in intelligent transportation systems, which could make the transportation system safer, more efficient, and more sustainable in the future.

***Figure 1:*** Comparison of the privacy-preserving CDPM to mechanistic models (EPR, dEPR, dtEPR, IPT) from the mobility literature. Among other statistics that are commonly employed in the mobility literature, we compare the entropy distribution of a set of simulated location trajectories (blue and green) with the entropy distribution of the observed location trajectories (grey). In this evaluation, the entropy distribution of the CDPM is fairly close to that of the real trajectories, which indicates that the information content of the sequences is similar.
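As a rough illustration of the entropy comparison described in Figure 1, the sketch below computes one entropy value per trajectory and collects them into a distribution. It assumes the plain Shannon entropy of visit frequencies; the source does not state which entropy variant was used, and the function names and toy trajectories are hypothetical.

```python
import math
from collections import Counter


def trajectory_entropy(trajectory):
    """Shannon entropy (in bits) of the visit frequencies in one trajectory.

    Assumption: visits are treated as unordered, so temporal structure is ignored.
    """
    counts = Counter(trajectory)
    total = len(trajectory)
    # Sum p * log2(1 / p) over all distinct locations.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_distribution(trajectories):
    """One entropy value per trajectory, ready to be plotted as a histogram."""
    return [trajectory_entropy(t) for t in trajectories]


# Hypothetical toy data: each trajectory is a sequence of discrete location IDs.
observed = [["home", "work", "home", "gym"], ["home", "home", "home", "home"]]
simulated = [["home", "work", "home", "work"], ["home", "shop", "work", "gym"]]

observed_entropies = entropy_distribution(observed)
simulated_entropies = entropy_distribution(simulated)
```

A simulator whose entropy histogram overlaps with the observed one produces trajectories of similar diversity in visited locations, which is one of several statistics one might check.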

***Figure 2:*** Out-of-distribution detection via epistemic uncertainty quantification. We developed an approach that can discern in-distribution from out-of-distribution data. When our model is applied to in-distribution test data (dark blue), it produces uncertainty estimates similar to those for the training data (grey). When the model is applied to out-of-distribution data (green), the histogram of uncertainty estimates is shifted further to the left. One can then apply conventional distributional tests to detect whether the distributions differ significantly from each other.
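The final step mentioned in Figure 2, testing whether two sets of uncertainty estimates come from the same distribution, could look roughly like the sketch below. It uses a two-sample Kolmogorov-Smirnov statistic with a permutation p-value, which is one possible choice of "conventional distributional test" and not necessarily the one used in the project. The uncertainty values are synthetic Gaussian draws and all names are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def permutation_p_value(a, b, n_permutations=200, seed=0):
    """Share of random relabelings whose KS statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Synthetic stand-ins for per-sample uncertainty estimates (assumption, not real data).
rng = random.Random(42)
train_uncertainty = [rng.gauss(0.0, 1.0) for _ in range(100)]
shifted_uncertainty = [rng.gauss(-3.0, 1.0) for _ in range(100)]  # shifted to the left

p_shifted = permutation_p_value(train_uncertainty, shifted_uncertainty)
```

A small p-value suggests that the new data produce uncertainty estimates unlike those seen during training, which is the signal used to flag out-of-distribution inputs.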

***Figure 3:*** Counterfactual explanations for retrospective decision making. Counterfactual explanations are used to illuminate how alterations in the input variables affect predicted outcomes, thereby enhancing the transparency of the deep learning model. We investigated the impact of contextual features on traffic speed prediction under varying spatial and temporal conditions.
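To make the idea behind Figure 3 concrete, the sketch below searches for the smallest change to a single input feature that moves a predicted traffic speed to a desired target. This is a deliberately simplified brute-force illustration, not the method of Wang et al.; the linear `predict_speed` function, its coefficients, and the feature names are all hypothetical stand-ins for a trained deep learning model.

```python
def predict_speed(features):
    """Hypothetical stand-in for a trained traffic-speed model (km/h)."""
    return 60.0 - 0.25 * features["traffic_volume"] - 5.0 * features["rain"]


def find_counterfactual(features, name, candidates, target_speed):
    """Candidate value of one feature, closest to the original, that reaches the target.

    Returns None when no candidate reaches the target speed.
    """
    best_value, best_distance = None, None
    for value in candidates:
        changed = {**features, name: value}  # alter only the chosen feature
        if predict_speed(changed) >= target_speed:
            distance = abs(value - features[name])
            if best_distance is None or distance < best_distance:
                best_value, best_distance = value, distance
    return best_value


# Hypothetical observed situation: heavy traffic while it is raining.
observed = {"traffic_volume": 100, "rain": 1}
counterfactual_volume = find_counterfactual(
    observed, "traffic_volume", range(0, 101, 10), target_speed=40.0
)
```

With these toy coefficients, the prediction for the observed input is 30 km/h, and the search returns a traffic volume of 60 as the smallest change that reaches 40 km/h. Such a statement ("had the volume been 60 instead of 100, the predicted speed would have been 40 km/h") is the kind of explanation a counterfactual provides.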

Bibliography

Publications



Hong, Y.; Xin, Y.; Dirmeier, S.; Perez-Cruz, F.; Raubal, M. "A causal intervention framework for synthesizing mobility data and evaluating predictive neural networks." Transportation Research Interdisciplinary Perspectives 31, 101398, 2025.

Wang, R.; Xin, Y.; Zhang, Y.; Perez-Cruz, F.; Raubal, M. "Counterfactual Explanations for Deep Learning-Based Traffic Forecasting." Preprint, 2024.

Dirmeier, S.; Hong, Y.; Perez-Cruz, F. "Synthetic location trajectory generation using categorical diffusion models." Preprint, 2024.

Dirmeier, S.; Hong, Y.; Xin, Y.; Perez-Cruz, F. "Uncertainty quantification and out-of-distribution detection using surjective normalizing flows." Preprint, 2023.