Citizen-Controlled

Citizen-controlled Data Science for Multiple Sclerosis Research

Started: April 1, 2018
Status: Completed
Abstract

Multiple Sclerosis (MS) is a complex chronic disease whose manifestation depends on clinical, environmental and individual factors. Prediction of individual progression is poor, and treatment decisions are often hampered by the lack of objective parameters (e.g., related to fatigue).

MS data served as the use case within the MIDATA project, which aims to develop an ethically fair and secure data infrastructure that permits the collection, integration and analysis of diverse types of data under the control of the citizen/patient.

The task of the SDSC within this project was to extract data from the doctor's reports collected and stored with the hospital software kisim. A doctor's report is a semi-structured text of a few to several dozen lines, where each line is associated with a topic such as diagnosis, current state, history, MRI or medication. The neurology clinic at the University Hospital Zurich (USZ) has developed and maintains the seantis database to store MS patient records in a structured manner. So far, the seantis database has been filled manually by transcribing information from the doctor's reports into the corresponding fields.
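
One way to picture the data the project works on is as a list of text lines, each carrying an optional topic label. The sketch below is illustrative only: the `ReportLine` type, the `split_report` helper, the report identifier and the sample text are all hypothetical and do not reflect the actual kisim or seantis formats.

```python
from dataclasses import dataclass
from typing import List, Optional

# Topic names taken from the examples in the text; the full label set is not given.
TOPICS = {"diagnosis", "current state", "history", "MRI", "medication"}


@dataclass
class ReportLine:
    """One line of a doctor's report (hypothetical record layout)."""
    report_id: str
    line_no: int
    text: str
    topic: Optional[str] = None  # None until labelled manually or predicted


def split_report(report_id: str, raw_text: str) -> List[ReportLine]:
    """Turn one report into line records, skipping empty lines."""
    lines = [ln.strip() for ln in raw_text.splitlines()]
    return [
        ReportLine(report_id, i, ln)
        for i, ln in enumerate(lines, start=1)
        if ln
    ]


# Invented sample text, not real patient data.
sample = "Relapsing-remitting MS since 2012\n\nCranial MRI: no new lesions\n"
records = split_report("r001", sample)
records[1].topic = "MRI"  # a manually assigned label
```

Keeping the label optional reflects the two situations described here: lines that have been labelled by hand, and lines from new reports whose topic is still unknown.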

People

Collaborators

SDSC Team:
Lilian Gasser
Luis Salamanca
Fernando Perez-Cruz

PI | Partners:

ETH Zurich, Institute of Molecular Systems Biology:

  • Dr. Ernst Hafen

ETH Zurich, Department of Computer Science:

  • Prof. Dr. Gunnar Rätsch
  • Prof. Dr. Christian Holz
  • Dr. Cristobal Esteban Aizpiri
  • Liliana Barrios

USZ, Klinik für Neurologie:

  • Dr. med. Andreas Lutterotti
  • Marc Hilty
  • Dr. med. Roland Martin

BFH Bern, Institute for Medical Informatics:

  • Dr. François von Kaenel

ETH Zurich, Scientific IT Services:

  • Gerhard Bräunlich

Description

Goal

Semi-automatic update of the MS database seantis using the doctor's reports.

Solution

  • Build an embedding of the doctor's reports using Doc2Vec, where each text line is treated as one document.
  • Train a multi-class classifier on the text lines, using the embedding vectors as features and manually assigned labels as targets. This intermediate step makes it possible to predict text line labels for new, unseen doctor's reports.
  • For specific parts of seantis (MS diagnosis, MRI information, ...), develop tailored classification procedures to predict columns of interest, e.g. MS diagnosis type, type of MRI (spinal or cranial), and whether new and/or contrast-medium-enhancing lesions were detected.

Impact

Facilitate the update of the seantis database by providing predictions for fields of interest based on extracted information from doctor's reports.

Gallery

Figure 1: General overview
Figure 2: Applied methodology
