
Synthetic Data for Biomedical Applications


Before joining SDSC, Arshjot Khehra received his MSc in Artificial Intelligence from USI Lugano, where he completed his thesis on hierarchical graph reinforcement learning. Previously, he worked for 4+ years across India and Singapore, gaining data science experience in the insurance, logistics, and manufacturing sectors. He also holds a BSc in Industrial Engineering from PEC Chandigarh. Over the course of his career, Arshjot has worked on a wide array of projects, such as handwritten text recognition and generation, voice matching across phone call recordings, policy lapse rate prediction for customer retention, and automated insurance claim processing.


Matthias Galipaud obtained his PhD in evolutionary biology in 2012 from the University of Burgundy in Dijon (France), and held postdoctoral positions as a mathematical biologist at the University of Bielefeld (Germany) and the University of Zurich, where he researched the evolutionary theories of aging and mate choice. In 2020, he became a data scientist, developing machine learning solutions for startups in Switzerland and Australia before joining the SDSC Innovation Team in November 2022.


Valerio started his career working for 7 years as a particle-physics researcher at CERN. There, he used state-of-the-art techniques to extract information from data, especially to search for traces of dark matter in particle collisions. Since 2016, he has worked in consulting, applying data science in several industries. First, he joined the Quant team of Ernst & Young in Geneva. Later, he created his own company, SamurAI sàrl, providing consulting services for his clients. He also has a passion for teaching very complex subjects in simple terms. That is why he particularly enjoys offering training programs to private companies and universities. Valerio joined the SDSC in May 2022 as a Principal Data Scientist with the mission of accompanying industrial partners and other institutions through their data science journey.

Presentation
Overview
Recently, synthetic data has enjoyed growing interest from the biomedical sector. Synthetic patient data helps mitigate privacy issues. Augmenting datasets with synthetic records can improve the performance of classification models trained on scarce health data or on rare minority classes (e.g. rare diseases).
During this one-day workshop, organized by CHUV and SDSC, we will review available tools for synthetic data generation and use cases in the biomedical and pharmaceutical sectors.
Details
Target Audience
Experienced professionals, executives, and data scientists in the biomedical and pharmaceutical sectors wishing to acquire hands-on knowledge on synthetic data generation and usage.
As the workshop involves hands-on sessions, prior experience with the programming language Python is required. The workshop will be held in English.
Programme
Objectives
By the end of the day, participants will:
- Have a grasp of currently available methods for synthetic tabular and image data generation.
- Have identified use cases and challenges of synthetic data in the biomedical and pharmaceutical sectors.
- Have hands-on experience with generating synthetic data with Python and evaluating its quality.
Agenda
09:00
Welcome coffee
09:30
Welcome & introduction
09:40
Synthetic data: How it works and where it is currently used
10:10
GANs, VAEs and diffusion: a deeper dive
10:55
Break
11:15
Applications in healthcare
11:45
Towards the use of synthetic data in biomedical applications: Evaluation of privacy and utility tradeoff
12:15
Lunch
13:15
Applications in the pharmaceutical industry
14:00
Hands-on (part 1): Understanding synthetic data generation (e.g. generating synthetic medical images for image classification)
14:50
Break
15:10
Hands-on (part 2): Understanding synthetic data evaluation (e.g. evaluating the utility and privacy of tabular synthetic data when sharing survival data)
16:40
Panel discussion
17:10
Concluding remarks, Apéro & Networking
Instructors
Jeremie Despraz, MS, Principal Data Scientist in Clinical AI, CHUV
Matthias Galipaud, PhD, Senior Data Scientist, SDSC, ETHZ
Beyrem Kaabachi, MS, Data Scientist in Health Data Privacy, CHUV
Arshjot Khehra, Data Scientist, SDSC, ETHZ
Jean-Louis Raisaro, PhD, Tenure-Track Assistant Professor in Biomedical data science, CHUV
Alena Simalatsar, PhD, Assistant Professor, HES-SO
Practical Information
Price
Non-members: 150/pers
Ongoing collaborations with SDSC or BDSC: free
Availability & Registration
52 registered participants - Registration closed.
Other events

Study-a-thon: From Bits to Breakthroughs in Clinical AI Modeling


Oksana is a disruptive innovator bringing her positive energy to projects. Driven by her curiosity and can-do attitude, she excels in industrial and academic contexts. Oksana earned her PhD in Life Sciences and Bioinformatics from the University of Lausanne after two MSc degrees, in Bioinformatics and in Information Systems, from the University of Geneva. For more than 10 years, she has been committed to actively promoting the value of data science and advocating best practices for reproducible and ethical research. She believes that the Swiss Data Science Center is a key player in building a competitive data economy in Switzerland, leveraging its innovative potential and renowned commitment to quality.


As an EPFL Life Science Engineer, my main interest is to do science with an impact. FAIR principles guide my work style, and I strive for user-centric infrastructure to encompass data science in the biomedical and governmental spheres. I have experience in Global Health, working with multi-hospital surveillance systems for pandemics, as well as training data scientists (thegraphcourses.org). My core side interests lie in ocean conservation, notably cetacean conservation, biodiversity, and untreated health problems in low- and middle-income countries. I have solid hard skills in problem-solving and data engineering in AI/ML, and have developed soft skills in creativity and social integration. I have acquired domain knowledge in a diversity of fields: from biology-related sciences such as human gut microbiology, epidemiology, and environmental sciences to social sciences such as anthropology and psychology. I am always happy to engage with new people on innovative and impactful topics, so please do reach out!


Stefan has a background in Biology and decided to move towards evolutionary bioinformatics for both his MSc and PhD. Over the years, he developed a passion for the entire data analysis process: from collecting data to analyzing and presenting results. Presentations, particularly opportunities for public speaking, are activities he enjoys since he values communication a lot. In order to follow this passion and deepen his knowledge of systems to collect and manage data, he joined SDSC in 2023 as a Biomedical Data Engineer. Outside work, Stefan is an avid reader of sci-fi books (but not only!), enjoys swimming, running, and biking both competitively and casually, and enjoys plenty of activities with friends, especially when beer is involved.


Almut Lütge joined the ORDES team in Zurich as a Biomedical Data Engineer in January 2024.
Almut did both her Bachelor's and Master's in molecular biotechnology with a major in bioinformatics at the University of Heidelberg in Germany.
After her Master's, she worked as a research assistant on population genetics at the NTNU in Trondheim, Norway.
In 2018, Almut started her PhD on benchmarking of single-cell analysis tools at the University of Zürich, followed by a short postdoc in pharmaceutical immunology at ETH Zürich.
Almut enjoys data-driven problem-solving and highly values open science.

Energy Hack Days 2025


Roberto holds an M.Sc. and a Ph.D. in Particle Physics from the University of Torino, Italy. He has worked for several years in fundamental research as a senior fellow and data scientist at the CERN Experimental Physics division and on a research project supported by the Belgian National Fund for Scientific Research (FNRS). In 2018 he moved to EPFL to work on data mining and Machine Learning techniques for the built environment and renewable energies. He has started and led multiple collaborations with academic and industry partners in the energy domain. Roberto joined the SDSC in September 2021 as a Principal Data Scientist with the mission of accompanying industries, NGOs and international organizations through their data science journey.


Rok obtained a B.A. in Physics from Washington University in St. Louis in 2003. After obtaining his PhD in theoretical Astrophysics from the University of Washington in 2010, Rok spent several years as a Postdoctoral researcher at the Institute for Computational Science, University of Zürich. Seeking new challenges, he moved to the ETH Scientific IT Services group, where he helped researchers across different ETH domains solve their (big) data analysis problems. He specialized in optimizing and scaling up data analysis tasks by mapping them to high-performance computing resources. Since July 2017 he has been at the Swiss Data Science Center developing Renku, the Center's data science platform.

SDSC-Connect: Translating AI to Clinical Practice


Nora earned her Ph.D. in Computer Science/Bioinformatics from the University of Tübingen, where she focused on the in silico design of peptide-based vaccines using combinatorial optimization and machine learning. During her postdoctoral fellowship at Memorial Sloan Kettering Cancer Center in New York, and later as a staff scientist at the New York Genome Center, she worked on metagenomics, infectious diseases, and cancer. In 2015, Nora joined NEXUS Personalized Health Technologies at ETH Zurich where her focus shifted towards the management of clinical and biomedical research data. In May 2024, she joined the Swiss Data Science Center as Head of Biomedical Data Science.


Anna joined SDSC as a Data Scientist focusing on industry collaborations in July 2019. She completed her PhD in Bioinformatics at the University of Luxembourg, where she analysed large-scale heterogeneous datasets and leveraged multiple disciplines: Statistics, Network Analysis, and Machine Learning. Before joining SDSC, Anna worked as a Data Scientist at Deloitte Luxembourg, with a focus on computer vision and time-series analysis. Currently, Anna is a Principal Data Scientist based at the ETH Zurich office, where she leads biomedical collaborations with industry partners. Anna works on a range of projects, including protein property prediction, biomanufacturing optimization, and statistical model evaluation.


Guillaume Obozinski graduated with a PhD in Statistics from UC Berkeley in 2009. He did his postdoc and then held a researcher position until 2012 in the Willow and Sierra teams at INRIA and Ecole Normale Supérieure in Paris. He was then Research Faculty at Ecole des Ponts ParisTech until 2018. Guillaume has broad interests in statistics and machine learning and has worked over time on sparse modeling, optimization for large-scale learning, graphical models, relational learning and semantic embeddings, with applications in various domains from computational biology to computer vision.


Snežana joined the SDSC industry team in June 2021 on a mission to advance the adoption of modern data-driven solutions in the domain of public health care. She has a background in experimental particle physics with a Diploma from ETH Zurich and a PhD from the University of Geneva. Snežana pursued fundamental research in the field of high energy physics at CERN for nine years, harnessing the power of machine learning and statistical methods to uncover the traces of new physics in petabytes of proton-proton collision data and to develop innovative particle identification algorithms. Since 2018, Snežana has served as a Data Science consultant, supporting partners from industries such as manufacturing, insurance, compliance services and online platforms in creating business value from internal and external data.


Marisol has a degree in Law and more than 15 years of experience working as a notary officer in Madrid. After relocating to Switzerland with her family, she obtained a certification to teach Spanish as a foreign language, dedicating four years to teaching Spanish online to students of all ages and backgrounds. Marisol has returned to her professional roots as an administrative assistant, joining the SDSC team in June 2023.
KI für Dienstleister im öffentlichen Raum


Saurabh Bhargava joined the SDSC as a Principal Data Scientist in the Industry Cell at the Zürich office in 2022. Saurabh previously worked in the retail sector and the advertising industry in Germany. He led the building of various data products for customers, using state-of-the-art machine learning methods and industrializing them to add value for those customers. He completed his PhD at ETH Zürich in June 2017, specializing in machine learning applications on audio data. He obtained his Master’s and Bachelor’s degrees from EPFL and the Indian Institute of Technology (IIT), Roorkee, India in 2011 and 2009 respectively. His interests and expertise are in combining state-of-the-art data science and data engineering tools for building scalable data products.


Christian joined the SDSC in July 2021 as a data scientist in the industry cell. Before that, he worked at the Media Technology Center at ETH Zurich to develop new ML technology for their industry partners. He completed his Master's degree in Mathematics at ETH Zurich (2017) with a focus on algorithmics and machine learning. After his studies, he worked as an online marketing data analyst in the news publishing business. His expertise lies in statistics, NLP and software engineering.

AI Insights for CFOs


Silvia holds an MSc in Computer Science from EPFL and a PhD in Computer Science from the University of York, UK. She has been a senior research fellow at the University of Trento and later at Politecnico di Milano, Italy. Here, she had the chance to work on Marie Curie and ERC projects relating to natural language processing. From 2012 to 2019, she was a Senior Manager and NLP expert at ELCA Informatique Switzerland, whose AI department she helped create and expand. Silvia joined the Swiss Data Science Center in 2019 and is currently its Chief Transformation Officer, in charge of the team leading organizations to digital transformation.

