Unraveling the roles of mRNA transcript leader sequences on gene expression
Eukaryotic gene expression undergoes tight regulation during mRNA-to-protein translation, with dysregulation potentially resulting in ribosome stalling, nonsense-mediated decay, or abnormal protein accumulation, thereby compromising mRNA integrity. These disruptions contribute to genomic imbalances and can result in neurodegenerative diseases. Initial investigations of model genes have established the fundamental roles of mRNA 5' transcript leader (TL) sequences in controlling ribosome recruitment, scanning, and translation initiation. The typical model for eukaryotic translation involves cap-dependent directional scanning, where TL cis-regulatory elements and corresponding trans-acting factors govern cap-dependent initiation under normal conditions. However, under stress, cap-dependent initiation is generally suppressed, and specific mRNA structures and sequences facilitate the translation of stress-responsive transcripts, remodeling the proteome. In some instances, cap-independent structural elements promote ribosome recruitment, bypassing the canonical mode of translation initiation. While the general properties of TL cis-regulatory elements are known, many questions remain about their relative contributions to gene expression.
The main objective of this project is to investigate the functional and relative impact of mRNA features on gene regulation and expression. My central hypothesis is that 5' TLs possess distinct sequence features that significantly influence translation efficiency, resulting in genes having different regulatory roles and trajectories. To test this hypothesis, my thesis work combines high-throughput experiments and computational methods to elucidate the roles of 5' TLs in gene expression regulation.
To study translation, we employ two massively parallel reporter systems, namely Fluorescence-Activated Cell Sorting (FACS-Seq) and Polysome Library Sorting (PoLib-Seq), to assess the in vivo effects of 5' TL cis-regulatory elements on gene expression. Additionally, we employ computational modeling techniques to quantify the relative influences of 5' TL features and mRNA structures. We also investigate yeast upstream open reading frames (uORFs) and quantify the repressive features associated with them. Furthermore, we extensively analyze putative cap-independent structural elements known as internal ribosome entry sites (IRESes). Through these investigations, I aim to provide a comprehensive understanding of translational control and to address questions about it that remain unanswered.
The dissertation is organized as follows:
- Chapter 1: Introduction - This chapter reviews the roles of 5' TLs in translation and summarizes the current knowledge of TL functions in translation initiation. It covers ribosome recruitment, 5' TL scanning, translation initiation, and experimental and computational analyses of 5' TL cis- and trans-acting features. Key topics include mRNA start codons, Kozak sequences, uORFs, and mRNA structures. Additionally, it explores known cases of 5' TLs containing upstream AUGs (uAUGs) and their functions in yeast and mammals. We also focus on insights gathered from ribosome occupancy, massively parallel reporter studies, and computational models that have paved the way for a comprehensive understanding of TL functions. The chapter concludes by discussing areas for future research on the roles of mRNA sequences and structures in translation. Overall, this chapter provides a thorough background for subsequent chapters.
- Chapter 2 - This chapter focuses on the systematic study of how yeast 5' TLs affect gene expression in vivo. To achieve this, we use massively parallel reporter assays to assay 86% of yeast 5' TLs. First, I analyze the impacts of alternative transcription start sites on gene expression. The chapter then delves into modeling ribosome scanning using start codon efficiencies and uAUGs. I then use reporter data to define and quantify additional 5' TL features such as cap structure, start codon structures, and RNA binding motifs. Finally, I use machine learning to determine the relative strengths of cis-regulatory elements on gene expression in yeast.
- Chapter 3 - uORFs are major regulators of gene expression, although their sequence features have not been systematically analyzed. Thus, we zoom in on uORFs and quantitatively analyze their sequence features. Thousands of yeast uORFs are tested via FACS to measure their impact on gene expression. The chapter specifically focuses on uORF structure, positional features, and uORF impacts on nonsense-mediated decay (NMD). Combining all AUG uORF data, we then construct a computational model that highlights key features and defines the scope of uORF activity in yeast.
- Chapter 4 - Previous work proposed that developmental genes use cap-independent methods for translation initiation. To gain a holistic understanding of translation, we aimed to study the significance of previously identified mammalian hyperconserved 5' TLs and putative IRESes. The proposed IRES structure for the developmental gene Hoxa9 is analyzed to determine its conservation across species. Ultimately, our analysis reveals that the proposed Hoxa9 IRES is misannotated and is a DNA element rather than mRNA. Other proposed IRES-containing sequences are examined using public data, leading to the discovery that many putative IRESes are false positives. This chapter emphasizes the need for appropriate experimental setups, controls, and accurate 5' annotations.
Together, these chapters investigate the multifaceted roles of 5' TLs in gene expression regulation by combining high-throughput experiments and computational modeling. First, we quantify the influences of Kozak strengths, uORF features, and mRNA structure along the 5' TL. Lastly, our analysis of mammalian TLs reveals flaws in commonly used reporter assays for studying cap-independent methods of translation. In conclusion, by unraveling the influence of 5' mRNA features, this research provides a deeper understanding of translation regulation and its impact on gene expression.
History
Date
2023-06-02
Degree Type
- Dissertation
Department
- Biological Sciences
Degree Name
- Doctor of Philosophy (PhD)