Open Research Projects


The projects listed below are part of the upcoming call for applications, opening on August 19. Additional projects may be added throughout the call period.


Adaptive Learning Dynamics of the Immune System

Abstract

The goal of this project is to invert the usual paradigm of applying extrinsic AI/ML algorithms to understand biological data. Instead, we want to study how intrinsic biological learning functions work by employing recent theoretical insights into the dynamics of artificial neural networks. The main setting we are going to investigate is the immune system. Recent theoretical progress suggests that self-adaptation capabilities could be key to enhancing its function considerably. Yet exploiting this requires a better understanding of the main mechanisms, which we will pursue by combining adaptive network models with statistical data assimilation techniques for macroscopic observables. In this project we aim to lay substantial groundwork for this program, combining methods from biology, computation, data science, dynamics, and machine learning. We will focus on building benchmark models for this context and validating them via simulation with forward uncertainty propagation and bifurcation analysis.
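As a purely illustrative sketch of forward uncertainty propagation near a bifurcation, the Python snippet below pushes an uncertain parameter through a toy one-dimensional system, dx/dt = r*x - x^3, which undergoes a pitchfork bifurcation at r = 0. The toy system, the function names, and all numerical settings are assumptions chosen for illustration; they are not the benchmark models the project will build.

```python
import random
import statistics


def steady_state(r, x0=0.1, dt=0.02, steps=5000):
    """Integrate dx/dt = r*x - x**3 with explicit Euler and return the final x.

    For r > 0 and x0 > 0 the trajectory approaches sqrt(r);
    for r < 0 it decays to 0 (pitchfork bifurcation at r = 0).
    """
    x = x0
    for _ in range(steps):
        x += dt * (r * x - x ** 3)
    return x


def propagate(r_mean, r_sd, n=200, seed=0):
    """Monte Carlo forward propagation of uncertainty in r.

    Draws r from a normal distribution (an assumed, hypothetical prior),
    pushes each draw through the model, and summarises the outputs.
    """
    rng = random.Random(seed)
    outputs = [steady_state(rng.gauss(r_mean, r_sd)) for _ in range(n)]
    return statistics.mean(outputs), statistics.stdev(outputs)
```

For example, `propagate(1.0, 0.1)` returns the sample mean and standard deviation of the steady state when r is drawn from a normal distribution centred on 1. Moving the mean of r toward 0 places samples on both sides of the bifurcation, which is where a summary by mean and standard deviation alone becomes least informative.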

 


Domain: Medicine & Health / Life Sciences

Supervisors: Christian Kuehn, TUM, Fabian Theis, Helmholtz Munich/TUM

 


AI-Guided Design of Disordered Protein Regions

Abstract

Intrinsically disordered protein regions (IDRs) are widespread in the human proteome but defy the classical structure–function paradigm. IDRs often engage in regulatory interactions with structured protein domains, where their dynamic interfaces can modulate access to functional sites, regulate enzymatic activity, or mediate complex assembly. The goal of this project is to develop novel computational strategies for designing IDRs that dynamically interact with structured protein domains across kinetic and thermodynamic ranges. To this end, the project will explore combinations of state-of-the-art generative models to design IDR sequences with tunable regulatory properties. Experimental validation will be performed using biochemical assays and solution NMR spectroscopy to assess binding affinity, kinetics, and specificity of the designed IDRs. These experiments will be primarily carried out by a dedicated postdoc in close collaboration with the PhD candidate. The doctoral candidate will focus on computational model development, while integrating experimental feedback to iteratively improve model performance in a “lab-in-the-loop” workflow.


Domain: Medicine & Health / Life Sciences

Supervisors: Iva Pritisanac, Helmholtz Munich, Thomas Reid Alderson, Helmholtz Munich


Closed-loop dynamical control of brain activation patterns in mice and neuronal organoids

Abstract

This project will develop a predictive model of brain activation patterns using calcium imaging and electrophysiology data from mouse cortex and human organoids. Combining global pharmacological modulation with precise sensory and optogenetic stimulation will enhance the contrastive learning framework for dynamics identification developed by the Schneider Laboratory so that it can recognize increasingly complex neural patterns (goal 1). The resulting predictive model will then be deployed for closed-loop control of targeted brain regions (goal 2). The study will validate the model’s generalizability while developing applications for memory restoration through pattern-specific interventions. The methodology parallels clinical neuroimaging approaches (fMRI/TMS), with the potential for advancing personalized neuromodulation therapies. Additionally, findings will inform human organoid engineering for tissue replacement and brain-machine interface applications.


Domain: Medicine & Health / Life Sciences

Supervisors: Steffen Schneider, Helmholtz Munich, Gil Westmeyer, Helmholtz Munich/TUM


Statistical Methods for Multi-Modal Data Analysis in Human Disease Research

Abstract

Non-communicable diseases (NCDs) are responsible for 70% of global deaths, with cardiovascular diseases, cancers, respiratory diseases, and diabetes being the most prevalent. The growing burden of these diseases necessitates a shift from reactive treatment to predictive and preventive healthcare. This PhD research aims to develop novel statistical methodologies to analyze multi-modal longitudinal omics data, integrating infrared (IR) molecular fingerprinting, mass spectrometry (MS)-based proteomics, and nuclear magnetic resonance (NMR)-based metabolomics. Using data from the Health for Hungary (H4H) and German National Cohort (NAKO) studies, this research will focus on statistical trajectory modeling and machine learning to detect early disease markers and predict disease onset. The objectives include designing statistical models for disease progression, integrating multi-modal data sources, applying machine learning algorithms for disease prediction, optimizing statistical study design, and validating findings with independent datasets. The methodology involves, among other techniques, mixed-effects models and functional data analysis for trajectory modeling. Multi-modal data integration will leverage dimension reduction techniques. Machine learning approaches such as ensemble learning and interpretable AI methods will enhance predictive modeling. Statistical study design optimization will include sample size determination, missing data imputation, and cross-validation strategies. This research is expected to contribute novel statistical frameworks for disease trajectory analysis, improve predictive modeling of disease progression, and establish scalable methodologies for large-scale health studies. The outcomes will support early intervention strategies, enhance personalized healthcare, and contribute to global efforts in preventive medicine.


Domain: Life Sciences

Supervisors: Göran Kauermann, LMU, Ferenc Krausz & Kosmas Kepesidis, LMU, Annette Peters, Helmholtz Munich


Uncovering the mechanisms of lung remodeling following acute respiratory lung infection using spatial transcriptomics

Abstract

Lung anatomical structures are often severely damaged following an acute respiratory infection, sometimes leading to death in the most extreme cases. Yet, remarkably, the lung demonstrates a significant capacity for regeneration and repair. The mechanisms underlying this resilience remain poorly understood. In this project, we aim to develop computational models that integrate spatial, temporal, and perturbation data to better understand how the lung undergoes remodeling. Situated at the intersection of computational biology, clinical practice, and pathology, this project holds the potential to uncover novel therapeutic strategies to promote lung repair.


Domain: Medicine & Health / Life Sciences

Supervisors: Malte Lücken, Helmholtz Munich, Emmanuel Saliba, HIRI


Multimodal AI Models for Patient Stratification and Prognostic Biomarkers in Osteoarthritis

Abstract

Osteoarthritis (OA) affects over 500 million people worldwide, yet the development of disease-modifying treatments remains a major challenge due to the condition’s heterogeneity, slow progression, and complex regulatory landscape. Within the framework of the PROBE consortium, this PhD project aims to harness advanced artificial intelligence (AI) and multimodal data integration to improve patient stratification, prognosis, and the identification of novel endpoints for OA clinical trials. The candidate will analyze deeply phenotyped datasets spanning clinical, imaging, and multi-omics modalities, leveraging a secure federated data platform developed within PROBE. Using methods ranging from advanced probabilistic models such as multi-omics factor analysis to multimodal AI models trained with contrastive learning, the project will generate latent representations that act as composite biomarkers, capturing the complexity of OA progression. These representations will enable the unsupervised identification of patient subgroups, prediction of disease trajectories, and translation into scalable proxy biomarkers for larger cohorts. Close collaboration with consortium partners and iterative alignment with unimodal foundation models developed in other PROBE work packages will ensure methodological robustness and clinical relevance. Ultimately, this project will deliver both methodological advances in AI for multimodal biomedical data and translational insights to guide therapeutic development, supporting more effective and patient-centered treatment strategies for OA.


Domain: Life Sciences

Supervisors: Matthias Heinig, Helmholtz Munich, Eleftheria Zeggini, Helmholtz Munich

Decoding and targeting the PDAC ecosystem DEFEAT-PDAC

Abstract

Pancreatic ductal adenocarcinoma (PDAC) is one of the deadliest and most therapy-resistant cancers, characterized by late detection, rapid metastasis, and poor response to conventional treatments. Standard chemotherapy has remained largely unchanged over the past three decades, offering only marginal survival benefits (6–12 months) and causing severe side effects. Most patients are ineligible for surgery, and recurrence is almost inevitable. While immunotherapies and targeted treatments have improved outcomes for other cancers, they have shown limited success in PDAC due to its complex and adaptable tumor ecosystem, composed of cancer cells, fibroblasts, and immune cells that collectively promote resistance and aggression. However, recent scientific and technological breakthroughs offer renewed hope. Innovations such as AI-guided drug discovery, protein and cell engineering, and single-cell analytics are opening new therapeutic avenues. Notably, the development of inhibitors of RAS, a target once considered undruggable, and evidence linking T cell activity to long-term survival suggest promising therapeutic directions. Nonetheless, these approaches have so far benefitted only a small subset of patients, underscoring the need for deeper mechanistic insights and individualized strategies.


Domain: Medicine & Health

Supervisors: Fabian Theis, Helmholtz Munich/TUM, Dieter Saur, TUM

Deep Learning Integration of Genomic Sequences, Transcriptomics and Interaction Networks for Phenotype Prediction in Eukaryotes - PhenoPred

Abstract

Predicting phenotypes from genotypes is a grand challenge of biology with substantial translational implications. Using deep learning, we have already made significant progress in capturing genotype-phenotype relationships in prokaryotes. Owing to the uniquely large amount of phenotypic and molecular data available for essentially all molecular modalities, S. cerevisiae is the ideal model to translate these achievements to eukaryotes. We propose to develop and validate a multimodal deep learning framework that builds on genomic sequences, transcriptomes, environmental parameters, and regulatory and physical interaction networks to predict growth phenotypes and cell cycle properties. Powerful, AI-ready datasets enable model training and validation, e.g. one describing 200 distinct phenotypes for 4,000 deletion mutants, as well as unpublished annotated time-lapse imaging data. For robust integration across molecular modalities, we will develop new deep learning architectures for complex multimodal data, including hierarchical attention networks to capture relationships in molecular networks. By integrating foundation models pre-trained on large datasets, we will obtain immediate insights into yeast phenotypes and establish a scalable foundation for broader applications. Beyond immediate applications in infection research and biotechnology, our model will set the stage for future expansion towards human cells using transfer learning approaches and subsequent integrative models towards predictive medicine.


Domain: Life Sciences

Supervisors: Pascal Falter-Braun, Helmholtz Munich/LMU, Kurt Schmoller, Helmholtz Munich

Genomic traffic control – mapping and preventing transcription-replication conflicts in cancer genomes

Abstract

Cells must copy their DNA while simultaneously transcribing it, creating a fundamental scheduling problem: replication and transcription use large, processive molecular machines that often compete for access to the same template. When these machineries collide, transcription–replication conflicts (TRCs) can arise, contributing to replication stress and DNA damage, which can result in diseases like cancer. Yet we still lack genome-wide methods to map and predict where such conflicts occur.

This project takes a computational approach to investigate TRCs. We will integrate diverse multi-omic datasets on transcriptional activity and replication dynamics to build a predictive “roadmap” of how these processes interact. In parallel, we will generate ground-truth data with TRC-seq, a novel protocol based on Nanopore long-read sequencing, and develop new computational tools for its analysis.

Our core innovation is a hybrid framework: an agent-based simulator of replisome–RNA polymerase II (RNAPII) dynamics coupled with machine learning models trained on genomic sequence and multi-omic tracks. By iteratively fitting simulations to experimental data through inference techniques, we will obtain both mechanistic insight and predictive power. Finally, applying these models to cancer genomes will reveal how TRCs drive mutational patterns and epigenomic alterations.

Beyond the biological problem, this project addresses major computational challenges in modeling stochastic genome-scale processes, integrating heterogeneous omics data, and analyzing single-molecule long-read sequencing. It aims to establish generalizable methods at the interface of computational biology, machine learning, and genome science.
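
To make the idea of an agent-based replisome–RNAPII simulator more concrete, here is a minimal toy sketch in Python, assuming a single replication fork and a single gene on a one-dimensional template. All names, speeds, and probabilities are hypothetical placeholders; this is not the project's simulator, and it ignores fork stalling, origin firing, and RNAPII pausing.

```python
import random
from dataclasses import dataclass


@dataclass
class Polymerase:
    position: int   # current coordinate on the template
    direction: int  # +1: moves with the fork, -1: moves toward the fork


def simulate_trcs(genome_len=1000, gene=(400, 600), direction=-1,
                  init_prob=0.05, fork_speed=3, pol_speed=1, seed=0):
    """Count fork/polymerase encounters during one pass of a single fork.

    The fork starts at coordinate 0 and moves right. Polymerases initiate
    stochastically at the promoter and travel along the gene in `direction`.
    """
    rng = random.Random(seed)
    start, end = gene
    promoter = start if direction == 1 else end
    fork = 0
    polymerases = []
    conflicts = {"head_on": 0, "co_directional": 0}
    while fork < genome_len:
        # Stochastic transcription initiation at the promoter.
        if rng.random() < init_prob:
            polymerases.append(Polymerase(promoter, direction))
        new_fork = fork + fork_speed
        survivors = []
        for pol in polymerases:
            new_pos = pol.position + pol.direction * pol_speed
            if fork <= pol.position and new_fork >= new_pos:
                # The fork reaches a polymerase ahead of it: record the
                # encounter and remove the polymerase from the template.
                kind = "head_on" if pol.direction == -1 else "co_directional"
                conflicts[kind] += 1
            elif start <= new_pos <= end:
                pol.position = new_pos
                survivors.append(pol)
            # Otherwise the polymerase has left the gene and terminates.
        polymerases = survivors
        fork = new_fork
    return conflicts
```

Calling `simulate_trcs(direction=-1)` counts head-on encounters for a gene transcribed against the fork, while `direction=1` counts co-directional ones. In a hybrid framework of the kind described above, parameters such as `init_prob` would be the quantities fitted to experimental data.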


Domain: Life Sciences

Supervisors: Antonio Scialdone, Helmholtz Munich/LMU, Stephan Hamperl, Helmholtz Munich