Pheno-Mine: Extracting dynamic ideotypes from seasonal image time series of wheat taken in the field

September 30, 2022
In Progress


Adapting field crops to a changing climate requires a profound understanding of growth dynamics in relation to the environment. Recent advances in field phenotyping promise to facilitate the collection of the data essential for such analyses. Indeed, over the last seven years, the field phenotyping platform (FIP) at ETH has continuously collected image time series of more than 350 wheat genotypes. Analyzing these data involves a significant number of steps, most importantly extracting low-level features from images, modeling the dynamics of these features, and relating these dynamics to target traits such as yield, protein content, earliness, and drought tolerance. The complexity of these steps prevents researchers from systematically “mining” image time series without first defining a set of growth-dynamics parameters to optimize. Consequently, one cannot expect to find unforeseen associations between growth dynamics and target traits. To overcome this limitation, this project aims to combine contemporary deep learning methods, such as convolutional and recurrent neural networks, with generative image models. A neural network will be trained on image time series of genotypes to predict their performance with respect to a target trait, e.g., yield. The visualized response buildup in a latent space constrained by a priori information will then be treated as a new trait, a so-called “dynamic ideotype”, that represents a characteristic growth trajectory. Analyzing these ideotypes will enable researchers to identify favorable growth dynamics. The newly gained insights will in turn lead to a better understanding of growth dynamics and responses to the environment. In the long term, such methods will hopefully allow plant physiologists and breeders to mine their datasets with less bias towards their initial hypotheses, help adapt crops to future climate scenarios, and thereby contribute to securing the global food supply.



SDSC Team:
Xiaoran Chen
Paraskevi Nousi
Michele Volpi

PI | Partners:

ETH Zurich, Group of Crop Science:

  • Prof. Dr. Achim Walter
  • Dr. Lukas Roth

More info



The aim of the project is to learn latent phenotypes, or ideotypes, of crops from temporal growth information, weather data, and measurements, in order to discover new traits and couple them with genotypes. Specifically, using both temporal imagery and numeric measurements, a latent description can be learned that embeds both sources of information and also reflects genetic similarity, while crop traits are disentangled to ensure interpretability.

Figure 1. Overview of the project. Sequences of images are taken at different growth stages over a collection of plots and over six years. The image sequences are mapped to a latent space by neural networks, which are then used to predict target plant traits. Growth patterns and phenotypic behaviors are identified in this way in the image sequences.

Proposed Approach / Solution

Within the scope of the project, the problem is approached by encoding image and height sequences into a common vector representation and then mapping this representation to the plant traits. Separate models for image and height sequences were investigated before fusing the two into a single approach. Furthermore, genomic selection approaches studied in recent literature are investigated for the purposes of the project, i.e., using genotype marker data in combination with environmental data to predict plant traits.
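The encode-then-fuse idea can be illustrated with a minimal sketch. It is not the project's model: the mean-pooled tanh encoder, the linear trait head, the random untrained weights, and all names and sizes (`encode_sequence`, `predict_traits`, `LATENT`, the sequence lengths) are assumptions made for illustration only. It shows two sequences of different length and dimensionality being reduced to fixed-size vectors, concatenated into one common representation, and mapped to five trait values.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_sequence(seq, weights):
    """Project each time step, apply tanh, then mean-pool over time."""
    return np.tanh(seq @ weights).mean(axis=0)

def predict_traits(image_feats, heights, params):
    """Late fusion: encode each modality, concatenate, map linearly to traits."""
    z_img = encode_sequence(image_feats, params["w_img"])
    z_hgt = encode_sequence(heights, params["w_hgt"])
    z = np.concatenate([z_img, z_hgt])  # common vector representation
    return z @ params["w_out"] + params["b_out"]

# Hypothetical sizes, chosen only for this example.
T_IMG, D_IMG = 12, 64   # 12 imaging dates, 64 features per image
T_HGT = 20              # 20 canopy height measurements
LATENT = 16             # size of each modality's encoding
N_TRAITS = 5            # number of target traits

# Random, untrained weights: the output is meaningless until fitted.
params = {
    "w_img": rng.normal(scale=0.1, size=(D_IMG, LATENT)),
    "w_hgt": rng.normal(scale=0.1, size=(1, LATENT)),
    "w_out": rng.normal(scale=0.1, size=(2 * LATENT, N_TRAITS)),
    "b_out": np.zeros(N_TRAITS),
}

# Synthetic inputs standing in for one plot's image features and heights.
image_feats = rng.normal(size=(T_IMG, D_IMG))
heights = np.linspace(0.05, 1.0, T_HGT).reshape(-1, 1)

traits = predict_traits(image_feats, heights, params)
print(traits.shape)
```

Because pooling over time removes the sequence length, the two modalities do not need to share a sampling schedule. A recurrent or transformer encoder could replace `encode_sequence` without changing the fusion step.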

One major challenge in this task is learning to accurately predict the traits of wheat plants in unseen environments (e.g., different years) and of unseen genotypes in unseen environments, which relies heavily on the phenotypic behavior captured by the input data. The discrepancy between the training and test data is highlighted in Figure 2, where FIP.2019 is used as the test set and split into two subsets: unseen environments, and unseen genotypes in unseen environments. Despite learning the training set well, the baseline model does not generalize well to either test subset, although it performs marginally better on the 'unseen environments' subset.
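One way such a split could be constructed is sketched below. The record layout (`Plot`), the function name, and the rule that held-out genotypes are dropped from the training years are assumptions for illustration; the page does not describe how the project actually builds its subsets.

```python
from collections import namedtuple

# Hypothetical record: one field plot, identified by genotype and year.
Plot = namedtuple("Plot", ["genotype", "year"])

def split_by_year_and_genotype(plots, test_year, held_out_genotypes):
    """Return (train, unseen_env, unseen_geno_env).

    train:           other years, genotypes that are not held out
    unseen_env:      test year, genotypes also present in training
    unseen_geno_env: test year, genotypes held out from training
    """
    train, unseen_env, unseen_geno_env = [], [], []
    for plot in plots:
        held_out = plot.genotype in held_out_genotypes
        if plot.year == test_year:
            (unseen_geno_env if held_out else unseen_env).append(plot)
        elif not held_out:
            train.append(plot)
        # Held-out genotypes from training years are dropped entirely,
        # so the model never sees them before testing.
    return train, unseen_env, unseen_geno_env

# Toy data: three genotypes grown in three years.
plots = [Plot(g, y) for y in (2017, 2018, 2019) for g in ("G1", "G2", "G3")]
train, unseen_env, unseen_geno_env = split_by_year_and_genotype(
    plots, test_year=2019, held_out_genotypes={"G3"}
)
print(len(train), len(unseen_env), len(unseen_geno_env))
```

Splitting on both axes separates the two sources of distribution shift: the first test subset changes only the environment, while the second changes environment and genotype together.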

Figure 2. Predicted value (x axis) vs true value (y axis) for 5 plant traits. The x=y line is shown as well; samples close to it are predictions that agree well with the ground truth. Despite learning the training set well (middle column), models struggle when predicting the traits of wheat plants in unseen environments (left column) and even more so when predicting the traits of unseen genotypes in unseen environments (right column).
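When reading such scatter plots, it can help to separate correlation from agreement with the x=y line. The sketch below uses made-up numbers, not project results, and the helper names are hypothetical. It shows that predictions shifted by a constant offset are perfectly correlated with the truth yet still lie off the identity line, which the root-mean-square error captures.

```python
import math

def pearson_r(pred, true):
    """Pearson correlation between predictions and ground truth."""
    n = len(pred)
    mean_p, mean_t = sum(pred) / n, sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t)

def rmse(pred, true):
    """Root-mean-square distance from the x=y line."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))

# Illustrative values only: every prediction is one unit too high.
true = [4.0, 5.0, 6.0, 7.0]
pred_offset = [t + 1.0 for t in true]

print(pearson_r(pred_offset, true))  # correlation ignores the offset
print(rmse(pred_offset, true))       # error reflects the offset
```

Reporting both quantities per test subset would distinguish a model that ranks genotypes correctly but is biased in a new year from one that fails to rank them at all.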


Plant growth is a complicated, dynamic process that is difficult to measure and quantify manually. The field phenotyping platform (FIP) at ETH Zurich captures image time series at high image and temporal resolution. The aim is to analyze these data, extract ideotypes, and help breeders better adapt their plants to foreseeable future environmental conditions.



Additional resources


  1. Ubbens, Jordan, et al. "Latent space phenotyping: automatic image-based phenotyping for treatment studies." Plant Phenomics 2020 (2020).
  2. Dosovitskiy, Alexey, et al. "An image is worth 16x16 words: Transformers for image recognition at scale." arXiv preprint arXiv:2010.11929 (2020).
  3. Caron, Mathilde, et al. "Emerging properties in self-supervised vision transformers." Proceedings of the IEEE/CVF international conference on computer vision. 2021.
  4. Kim, Wonjae, Bokyung Son, and Ildoo Kim. "ViLT: Vision-and-language transformer without convolution or region supervision." International conference on machine learning. PMLR, 2021.

