dc.contributor.author: Burger, Bram
dc.date.accessioned: 2021-06-07T13:28:21Z
dc.date.available: 2021-06-07T13:28:21Z
dc.date.issued: 2021-06-10
dc.date.submitted: 2021-05-21T10:31:05.455Z
dc.identifier: container/c9/28/b6/b9/c928b6b9-f85a-4273-8997-8fba59fd6013
dc.identifier.isbn: 9788230845288
dc.identifier.isbn: 9788230866023
dc.identifier.uri: https://hdl.handle.net/11250/2758248
dc.description.abstract: The methods to study proteins are continuously improving, making it possible to identify and study increasingly more proteins. It is important to keep studying and improving the way experiments are designed, performed, and interpreted. This thesis concerns statistical considerations for the design and interpretation of proteomics experiments using mass spectrometry. The main focus is on protein and pathway interactions and networks, extending an existing workflow to additionally identify very small peptides and single amino acids, and effectively making balanced batches in a fixed cohort.
The biological functions of proteins are determined by their interactions with other molecules. While most proteins have a limited number of specific interaction partners, the whole system of these interactions is part of what defines organisms. Interactions between proteins, and with other molecules, can be grouped into different types of reactions. A set of actions and reactions leading to a specific outcome is a pathway, which defines a more or less specific process. Results from statistical analyses of protein abundances are often put in a biological context by mapping proteins to these pathways, i.e. by pathway analysis. The interactions between proteins and the definitions of pathways are collected in pathway databases, based on available literature or computational inference. Due to the central role these databases play in the interpretation of proteomics experiments, they can directly influence the outcome of pathway analysis algorithms. We investigated the structure and evolution of one of the manually curated pathway databases, using network statistics focusing on the connectivity of the annotated proteins and the hierarchical nature of the pathways. Additionally, we hope to aid in improving the understanding of the underlying basis for pathway analysis results and give practical advice for their interpretation.
Several pathways involve the digestion of proteins. This is done not only to extract energy from food, but also to modify proteins by proteolytic cleavage or to recycle proteins so their amino acids can be reused in other proteins. Additionally, small peptides and single amino acids can have functions of their own and play an important role in maintaining homeostasis. Differential abundance of endogenous (i.e. naturally occurring) peptides can thus potentially be used as an indication of possible dysfunctional proteases, peptidases, or proteolytic pathways. Only molecules that can attract a charge can be detected by mass spectrometry. Many amino acids and very small peptides are not able to attract a charge under standard conditions in mass spectrometry. Additionally, singly charged molecules are often ignored as they are likely to be contamination. While there are kits to specifically detect single amino acids and some small peptides using mass spectrometry, these are not easily incorporated into standard workflows. In a proof-of-concept experiment we used isobaric tags in a standard peptidomics workflow. As the tags acquire a charge themselves, this makes it possible to detect the single amino acids and very small peptides when singly charged molecules are also selected for fragmentation. The identification of the isobaric tag in the MS2 spectra serves as a rudimentary check that the molecule is not a contaminant. This simple addition to a standard workflow, combined with a naive mass-based identification approach, showed that with a limited amount of extra work a large number of single amino acids and small endogenous peptides could be identified, which could lead to potential novel insights and understanding.
Processing of samples in biomedical experiments generally proceeds sequentially. Due to time and technical limitations, the number of samples that can be processed at once is often fewer than the number of available samples, which means that at least some steps of the experiment will have to be processed in different batches. It then becomes important to make sure that the batching process does not introduce confounders. For smaller experiments, expert scientists can do this by hand, but it quickly becomes cumbersome and time-consuming for large and imbalanced cohort set-ups. To aid in the process of designing batch allocations that are as balanced as possible, given the available samples, we have developed a fast and intuitive heuristic algorithm, which can be applied to single-variable models where the treatment variable is nominal. This automated procedure can free researchers to focus on other aspects of the experiment, and provides a marked improvement over a more naive random allocation procedure. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: The University of Bergen [en_US]
dc.relation.haspart: Paper I: Burger B., Hernández Sánchez L.F., Lereim R.R., Barsnes H., Vaudel M. Analyzing the structure of pathways and its influence on the interpretation of biomedical proteomics data sets. Journal of Proteome Research, 2018, 17, 11, 3801-3809. The accepted version is available at: https://hdl.handle.net/1956/22539 [en_US]
dc.relation.haspart: Paper II: Burger B.§, Lereim R.R.§, Berven F.S., Barsnes H. Detecting single amino acids and small peptides by combining isobaric tags and peptidomics. European Journal of Mass Spectrometry (Chichester), 2019, 25, 6, 451-456. The accepted version is available at: https://hdl.handle.net/1956/22962 [en_US]
dc.relation.haspart: Paper III: Burger B., Vaudel M., Barsnes H. Automated blocking and batching of biomedical experiments with sequential processing. The article is not available in BORA. [en_US]
dc.rights: Attribution (CC BY). This item's rights statement or license does not apply to the included articles in the thesis.
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Statistical considerations for the design and interpretation of proteomics experiments [en_US]
dc.type: Doctoral thesis [en_US]
dc.date.updated: 2021-05-21T10:31:05.455Z
dc.rights.holder: Copyright the Author. [en_US]
dc.contributor.orcid: 0000-0003-1093-1290
dc.description.degree: Doctoral thesis
fs.unitcode: 12-12-0



Unless otherwise stated, this item is licensed as Attribution (CC BY). This item's rights statement or license does not apply to the included articles in the thesis.