NCMM Tuesday Seminar: Jussi Taipale

Invited speaker Professor Jussi Taipale will present at the NCMM Tuesday Seminar on the topic “Towards predicting gene expression from sequence”.


Towards predicting gene expression from sequence

Understanding the information encoded in the human genome requires two genetic codes: the first specifies how mRNA sequence is converted to protein sequence, and the second determines when and where the mRNAs are expressed. Although the proteins that read the second, regulatory code – transcription factors (TFs) – have been largely identified, the code is poorly understood. In other words, we still cannot effectively predict when and where genes are expressed based on their DNA sequence. Our solution to this problem is the application of overwhelming experimental force combined with advanced computational methods. For this purpose, we have generated several genome-scale datasets, including sequence-specific binding affinities of human TFs to unmodified and epigenetically modified DNA. We have also begun to identify the major unknown factors in our quantitative understanding of transcription by performing several experiments that bridge the gap between in vivo analyses such as eQTLs, RNA-seq and ChIP-seq and in vitro studies such as SELEX. These approaches include analysis of TF binding to genomic variants, TF binding in the presence of the nucleosome, determining DNA-binding activities of all TFs from distinct cell types, and measuring transcriptional activities of TF motifs, genomic sequences and fully random DNA sequences in vivo. In particular, the random DNA experiments enable analysis of a sequence space that is several orders of magnitude larger than that of the human genome. We believe that application of machine-learning methods to such datasets will enable a full "predictive understanding" of gene expression: determining rules that are both understandable at a conceptual level by humans and sufficient for computational generation of accurate quantitative predictions.

References:

Wei et al., A protein activity assay to measure global transcription factor activity reveals determinants of chromatin accessibility. Nature Biotechnology, 2018 (DOI: 10.1038/nbt.4138)

Zhu et al., The interaction landscape between transcription factors and the nucleosome. Nature, 2018 (DOI: 10.1038/s41586-018-0549-5)

Yan et al., Systematic analysis of binding of transcription factors to noncoding variants. Nature, 2021 (DOI: 10.1038/s41586-021-03211-0)

Sahu et al., Sequence determinants of human gene regulatory elements. Nature Genetics, 2022 (DOI: 10.1038/s41588-021-01009-4)

Hartonen et al., PlotMI: visualization of pairwise interactions and positional preferences learned by a deep learning model from sequence data. bioRxiv, 2021 (https://doi.org/10.1101/2021.03.14.435285)

Short bio

Professor Jussi Taipale obtained his Ph.D. at the University of Helsinki in 1996 and continued there for his postdoctoral research before moving to Johns Hopkins University (Baltimore, MD, USA). Since 2003, he has headed an independent research laboratory focusing on systems biology of growth control and cancer. He has published more than 100 scientific articles, of which 22 are in the most prestigious scientific journals (Nature, Science and Cell), has won numerous awards and grants (e.g., Anders Jahre Prize for Young Researchers, EMBO Young Investigator, ERC Advanced Grant and Vetenskapsrådet Distinguished Professor Program) and is internationally recognized as a leader in the field of genomics and systems biology. In 2012, Professor Taipale was elected as a Member of the Nobel Assembly at the Karolinska Institutet, which awards the Nobel Prize in Physiology or Medicine. In 2017, he took up the position of Herchel Smith Professor of Biochemistry at the University of Cambridge, UK, and he also maintains research groups at the University of Helsinki, Finland, and Karolinska Institutet, Sweden.

The Taipale group’s main expertise is in high-throughput biology – particularly in combining experimental and computational approaches. The principal aim of the group is to answer two systems-level questions that are presently poorly understood: which mechanisms control the growth of tissues and organisms, and which rules specify how DNA sequence determines when and where genes are expressed. The group has experience in high-throughput screening using cDNA (Varjosalo et al., Cell 2008), RNA interference (Björklund et al., Nature 2006) and CRISPR (Haapaniemi et al., Nature Medicine 2018), and in computational and experimental methods to identify causative regulatory variants and mutations (see Yin et al., Science 2017; Zhu et al., Nature 2018; Jolma et al., Cell 2013 and Nature 2015; Yan et al., Cell 2013 and Nature 2021). In addition, the Taipale group has extensive expertise in mouse models of gene and regulatory region function (see Dumont et al., Science 1998; Ma et al., Cell 2002; Hallikas et al., Cell 2006; Sur et al., Science 2012).

Published Apr. 25, 2023 2:12 PM - Last modified May 12, 2023 2:58 PM