Development and application of computational methods for NGS-based microbiome research
Abstract
Advances in DNA sequencing technologies have dramatically expanded our knowledge of the composition and function of microbial communities in diverse environments. The most common Next Generation Sequencing (NGS)-based methods used for this purpose are marker gene sequencing (16S ribosomal RNA (rRNA), 18S rRNA and internal transcribed spacer (ITS)), metagenomics, and metatranscriptomics, all of which are widely applied, though to differing degrees. Meanwhile, numerous bioinformatic tools and workflows have been developed for complete and comprehensive analysis of data from these approaches, which makes it relatively easy to obtain basic results with standard procedures. However, current workflows can only provide generic analyses for well-studied environments, and the choice of methods affects results significantly. In this thesis, I explore best analytical practices and address bioinformatic challenges in NGS-based microbiome research, with emphasis on low-biomass and poorly characterized environments.
My work combines advanced applied bioinformatics with sophisticated method development. Papers I and II investigated microbial community composition in human obstructive lung diseases through marker gene sequencing, exploring the best sampling procedures to avoid contamination of airway microbiome samples. Papers III and IV characterized the microbial composition and functional potential of permafrost metagenomic samples, applying novel bioinformatic methods from a metagenome-assembled genome (MAG)-centric view. Paper V described MetaRib, a new tool for reconstructing full-length ribosomal gene sequences from large-scale metatranscriptomic datasets. MetaRib was able to perform fast rRNA reconstruction across multiple samples with a low false positive rate, even in very large datasets. In addition, it provides accurate taxonomy-independent relative abundance estimation.
Has parts
Paper I: Grønseth, R., Drengenes, C., Wiker, H.G., Tangedal, S., Xue, Y., Husebø, G.R., Svanes, Ø., Lehmann, S., Aardal, M., Hoang, T. and Kalananthan, T., 2017. Protected sampling is preferable in bronchoscopic studies of the airway microbiome. ERJ open research, 3(3):00019-2017. The article is available in the main thesis. The article is also available at: https://doi.org/10.1183/23120541.00019-2017
Paper II: Grønseth, R., Xue, Y., Jonassen, I., Haaland, I., Kommedal, O., Wiker, H.G., Drengenes, C., Bakke, P. and Eagan, T. Repeated bronchoscopy in health and obstructive lung disease: Is the airway microbiome stable? Full text not available in BORA.
Paper III: Xue, Y., Jonassen, I., Øvreås, L. and Taş, N., 2019. Bacterial and Archaeal Metagenome-Assembled Genome Sequences from Svalbard Permafrost. Microbiology resource announcements, 8(27):e00516-19. The article is available in the main thesis. The article is also available at: https://doi.org/10.1128/MRA.00516-19
Paper IV: Xue, Y., Jonassen, I., Øvreås, L. and Taş, N., 2020. Metagenome-assembled genome distribution and key functionality highlight importance of aerobic metabolism in Svalbard permafrost. FEMS microbiology ecology, 96(5):fiaa057. The article is available in the main thesis. The article is also available at: https://doi.org/10.1093/femsec/fiaa057
Paper V: Xue, Y., Lanzén, A. and Jonassen, I., 2020. Reconstructing ribosomal genes from large scale total RNA meta-transcriptomic data. Bioinformatics, 36(11):3365-3371. The article is available in the main thesis. The article is also available at: https://doi.org/10.1093/bioinformatics/btaa177