Abstract
Perennial ryegrass (Lolium perenne), one of the most widely used forage and cool-season turfgrasses worldwide, has a breeding history of more than 100 years. However, the current draft genome annotation and transcriptome characterization remain incomplete, mainly because full-length transcripts are very difficult to obtain. To explore the complete structure of mRNAs and improve the current draft genome, we performed PacBio single-molecule long-read sequencing of the full-length transcriptome of perennial ryegrass. We generated 29,175 high-confidence non-redundant transcripts from 15,893 genetic loci, among which more than 66.88% of transcripts and 24.99% of genetic loci were not previously annotated in the current reference genome. The 18,327 re-annotated transcripts enriched the reference transcriptome. In particular, 6,709 alternative splicing events and 23,789 alternative polyadenylation sites were detected, providing a comprehensive landscape of the post-transcriptional regulation network. Furthermore, we identified 218 long non-coding RNAs and 478 fusion genes. Finally, we explored the transcriptional regulation mechanism of perennial ryegrass in response to drought stress based on the newly updated reference transcriptome sequences, providing new information on the underlying transcriptional regulation network. Taken together, these results improve our understanding of the perennial ryegrass transcriptome and refine the annotation of the reference genome.
Data availability
The PacBio sequencing reads (accession number PRJNA549115) and the Illumina SGS reads (accession number PRJNA566226) generated in this study have been submitted to the BioProject database of the National Center for Biotechnology Information.
Abbreviations
- APA: Polyadenylation sites
- AS: Alternative splicing events
- CDS: Coding sequences
- CPAT: Coding potential assessment tool
- CPC: Coding potential calculator
- CNCI: Coding–non-coding index
- FLNC: Full-length non-chimeric reads
- GO: Gene ontology
- HQ: High-quality isoforms
- ICE: Iterative isoform-clustering program
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- KOG: EuKaryotic orthologous groups
- lncRNA: Long non-coding RNA
- LQ: Low-quality isoforms
- NFL: Non-full-length
- NGS: Next-generation sequencing
- Nr: NCBI non-redundant proteins
- ROI: Reads of insert
- ORF: Open reading frames
- PacBio sequencing: The PacBio single-molecule long-read sequencing technology
- Pfam: A database of conserved Protein families or domains
- RT-PCR: Reverse transcription polymerase chain reaction
- Swissprot: A manually annotated, non-redundant protein database
Acknowledgements
We are very grateful to Prof. Luis A. J. Mur from the Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, for critical discussion of the manuscript. We also thank Biomarker Technology Corporation (Beijing, China) for the facilities and expertise of the PacBio platform for library construction and sequencing, and the Editage Company (https://www.editage.com) for language editing.
Funding
This research was supported by the Scientific Technology Plan Program of Shenzhen (No. JCYJ20160331151245672), the National Natural Science Foundation of China (No. 31971770 and No. 31901397) and the Beijing Natural Science Foundation (No. 6204039).
Author information
Contributions
Conceived and designed the experiments: LH and YC. Performed the experiments: LX, KT and PT. Analyzed the data and drafted the manuscript: KT, YL and WG. All authors approved the final version of the manuscript for submission.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Research involving human participants and/or animals
This study does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by Stefan Hohmann.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
438_2019_1635_MOESM2_ESM.jpg
Fig. S2 Physiological measurements of perennial ryegrass seedlings under different drought stress conditions. (A) Leaf relative water content. (B) MDA content. (C) Proline content. (D) Total sugar content. Means ± SDs (n = 4). Different letters indicate significant differences at the 5% level of probability. (JPG 469 kb)
438_2019_1635_MOESM3_ESM.tif
Fig. S3 GO annotation of the biological processes identified in Drought_0-3d (A), Drought_0-8d (B) and Drought_3-8d (C). (TIFF 334 kb)
About this article
Cite this article
Xie, L., Teng, K., Tan, P. et al. PacBio single-molecule long-read sequencing shed new light on the transcripts and splice isoforms of the perennial ryegrass. Mol Genet Genomics 295, 475–489 (2020). https://doi.org/10.1007/s00438-019-01635-y
DOI: https://doi.org/10.1007/s00438-019-01635-y