Automatic Construction of Gene Regulatory Networks from Scientific Literature with LLMs

Timing: 2024/2025
Funding: TRA Modelling (University of Bonn) as part of the Excellence Strategy of the federal and state governments

About the Project: "AUTOMATIC CONSTRUCTION OF GENE REGULATORY NETWORKS FROM SCIENTIFIC LITERATURE WITH LLMs" 

The project will be conducted in consultation with Dr. Christiane Hellweg (DLR, German Aerospace Centre) and Prof. Dr. Holger Fröhlich (b-it). Prof. Fröhlich conducts research in statistical data mining and machine learning with a specific focus on applications in biomedicine. Dr. Hellweg's research focuses on the effects of radiation on organisms, its possible uses in cancer therapy, and the disruption of gene regulation it causes.

Gene regulatory networks describe the interactions of genes and proteins in living organisms. Disruption of those networks can cause a plethora of health problems, the most prevalent being cancer, which is always based on a disruption of gene regulation.

The biomedical community conducts a tremendous amount of research on cancer and its underlying genetic causes, which can differ strongly between cancer types. So many papers are published that it has become impossible to keep an overview. Furthermore, many papers explore connections between only a few genes under specific experimental conditions rather than full networks. The goal of the project is to leverage the text comprehension ability of LLMs to extract the partial networks described in individual papers and then combine them based on matching experimental conditions.
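
The combination step can be illustrated with a minimal Python sketch. It is not the project's implementation: it assumes that the LLM extraction has already produced one record per reported interaction, and the record layout, the `normalize` and `merge_partial_networks` helpers, and all gene, condition, and paper names are hypothetical placeholders. Matching conditions by normalized string equality is also a simplifying assumption; real experimental conditions would likely need a more careful comparison.

```python
from collections import defaultdict


def normalize(condition: str) -> str:
    """Crude condition matching (assumption): lowercase and collapse whitespace."""
    return " ".join(condition.lower().split())


def merge_partial_networks(records):
    """Group extracted edges by experimental condition.

    Each record is (paper_id, condition, regulator, target, effect).
    Returns {condition: {(regulator, target, effect): {paper_ids}}}.
    """
    networks = defaultdict(lambda: defaultdict(set))
    for paper_id, condition, regulator, target, effect in records:
        edge = (regulator, target, effect)
        # Remember which papers support each edge under this condition.
        networks[normalize(condition)][edge].add(paper_id)
    return {cond: dict(edges) for cond, edges in networks.items()}


# Placeholder records standing in for LLM extraction output (not real data).
records = [
    ("paper_1", "Condition A", "geneX", "geneY", "activates"),
    ("paper_2", "condition  a", "geneY", "geneZ", "inhibits"),
    ("paper_3", "condition a", "geneX", "geneY", "activates"),
    ("paper_4", "condition b", "geneX", "geneZ", "activates"),
]

merged = merge_partial_networks(records)
for condition, edges in merged.items():
    print(condition)
    for (regulator, target, effect), papers in sorted(edges.items()):
        print(f"  {regulator} -{effect}-> {target}  (from {sorted(papers)})")
```

In this sketch, the two partial networks reported under "condition a" are joined into one chain of three genes, while the edge observed under "condition b" stays in a separate network. Keeping the supporting paper identifiers per edge makes it possible to trace every merged interaction back to its sources.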

DATA SCIENCE AND LANGUAGE TECHNOLOGIES GROUP
Principal Investigator: Prof. Dr. Lucie Flek
Team: Frederik Labonte