
dc.contributor.author: Balzer, Susanne Mignon [eng]
dc.date.accessioned: 2013-10-30T12:11:30Z
dc.date.available: 2013-10-30T12:11:30Z
dc.date.issued: 2013-06-17 [eng]
dc.identifier.isbn: 978-82-308-2324-8 [en_US]
dc.identifier.uri: https://hdl.handle.net/1956/7455
dc.description.abstract: The introduction of this thesis provides background knowledge on the 454 sequencing technology and a detailed review of the most relevant sequencing artifacts. Chapter 1 puts the 454 sequencing technology into a historical context. Chapter 2 gives an overview of where 454 sequencing is applied, focusing on the most common application areas. Chapter 3 provides a detailed description of how 454 sequencing works, from library preparation to sequencing, imaging and data output. Here, the distinction between the different detail levels of sequencing information is crucial since data aggregation involves information loss. Chapter 4 describes where errors and artifacts can arise, how they are manifested in the sequencing data, and what impact they can have on downstream analyses. Finally, Chapter 5 puts the contributions into their respective analytical contexts and discusses their relevance for the research community. The first paper, published in Bioinformatics in September 2010 and presented at the European Conference on Computational Biology (ECCB) in Belgium the same year, comprises the exploration, modeling and simulation of 454 data. Under the title “Characteristics of 454 pyrosequencing data – enabling realistic simulation with Flowsim”, we present a detailed analysis of sequencing data and a simulation tool that facilitates the design of sequencing projects. The tool can be used to examine and quantify the impact of read length, coverage, sequencing errors and signal degradation on genome assembly. Furthermore, it enables the testing and benchmarking of known and novel algorithms, methods and tools in a number of application areas such as whole genome assembly, read alignment, read correction, single-nucleotide polymorphism (SNP) identification and metagenomics.
The second paper, “Systematic exploration of error sources in pyrosequencing flowgram data”, was published in Bioinformatics in July 2011 and presented at the Intelligent Systems for Molecular Biology (ISMB)/ECCB conference in Austria the same year. We added several features and modules to the existing simulation pipeline. These were based on the observation of several error sources, such as copying errors introduced through polymerase chain reaction (PCR), a method used in 454 sequencing for amplification of the templates. These errors appear as mutations and are virtually impossible to distinguish from true sequence variants. Similar to the second paper, the third paper, “Filtering duplicate reads from 454 pyrosequencing data”, focuses on a single error type, namely artificially duplicated reads. Our JATAC tool enables removal of this artifact on the most detailed sequencing data level, outperforming existing tools. The paper was published in Bioinformatics in April 2013. [en_US]
dc.language.iso: eng [eng]
dc.publisher: The University of Bergen [en_US]
dc.relation.haspart: Paper I: Characteristics of 454 pyrosequencing data – enabling realistic simulation with flowsim, Balzer S, Malde K, Lanzén A, Sharma A and Jonassen I. Published in Bioinformatics 2010, Volume 26, Issue 18, Pp. i420-i425. The article is available at: http://hdl.handle.net/1956/7456 [en_US]
dc.relation.haspart: Paper II: Systematic exploration of error sources in pyrosequencing flowgram data, Balzer S, Malde K and Jonassen I. Published in Bioinformatics 2011, Volume 27, Issue 13, Pp. i304-i309. The article is available at: http://hdl.handle.net/1956/7457 [en_US]
dc.relation.haspart: Paper III: Filtering duplicate reads from 454 pyrosequencing data, Balzer S, Malde K, Grohme MA and Jonassen I. Published in Bioinformatics 2013, Volume 29, Issue 7, Pp. 830-836. The article is available at: http://hdl.handle.net/1956/7458 [en_US]
dc.title: Characteristics of Pyrosequencing Data – Analysis, Methods, and Tools [en_US]
dc.type: Doctoral thesis
dc.rights.holder: Copyright the author. All rights reserved [en_US]

