
dc.contributor.author: Mathai, Neann Sarah
dc.date.accessioned: 2022-06-28T09:36:07Z
dc.date.issued: 2021-12-10
dc.date.submitted: 2021-12-03T11:49:48.969Z
dc.identifier: container/ef/37/97/43/ef379743-a91c-4e6e-bf2a-bb8167407f21
dc.identifier.isbn: 9788230847794
dc.identifier.isbn: 9788230857410
dc.identifier.uri: https://hdl.handle.net/11250/3001272
dc.description.abstract: Computational methods to predict the macromolecular targets of small organic drugs and drug-like compounds play a key role in early drug discovery and drug repurposing efforts. These methods are developed by building predictive models that aim to learn the relationships between compounds and their targets in order to predict the bioactivity of the compounds. In this thesis, we analyzed the strategies used to validate target prediction approaches and how current strategies leave crucial questions about performance unanswered. Namely, how does an approach perform on a compound of interest, with its structural specificities, as opposed to the average query compound in the test data? We developed and present new guidelines on validation strategies to address these shortcomings. We then present the development and validation of two ligand-based target prediction approaches: a similarity-based approach and a binary relevance random forest (machine learning) based approach, both of which have a wide coverage of the target space. Importantly, we applied a new validation protocol to benchmark the performance of these approaches. The approaches were tested under three scenarios: a standard testing scenario with external data, a standard time-split scenario, and a close-to-real-world test scenario. We disaggregated the performance based on the distance of the testing data to the reference knowledge base, giving a more nuanced view of the performance of the approaches. We showed that, surprisingly, the similarity-based approach generally performed better than the machine learning-based approach under all testing scenarios, while also having a target coverage that was twice as large. After validating the two target prediction approaches, we present our work on a large-scale application of computational target prediction to curate optimized compound libraries.
While screening large collections of compounds against biological targets is key to identifying new bioactivities, it is resource-intensive and challenging. Small to medium-sized libraries that have been optimized to have a higher chance of producing a true hit on an arbitrary target of interest are therefore valuable. We curated libraries of readily purchasable compounds by: i. utilizing property filters to ensure that the compounds have key physicochemical properties and are not overly reactive, ii. applying a similarity-based target prediction method, with a wide target scope, to predict the bioactivities of compounds, and iii. employing a genetic algorithm to select compounds for the library to maximize the biological diversity in the predicted bioactivities. These enriched small to medium-sized compound libraries provide valuable tool compounds to support early drug development and target identification efforts, and have been made available to the community. The distinctive contributions of this thesis include the development and benchmarking of two ligand-based target prediction approaches under novel validation scenarios, and the application of target prediction to enrich screening libraries with biologically diverse bioactive compounds. We hope that the insights presented in this thesis will help push data-driven drug discovery forward. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: The University of Bergen [en_US]
dc.relation.haspart: Paper 1: Mathai, N.; Chen, Y.; Kirchmair, J. Validation strategies for target prediction methods, Briefings in Bioinformatics, 2020, 21(3), pp. 791-802. The article is available at: https://hdl.handle.net/1956/21871 [en_US]
dc.relation.haspart: Paper 2: Mathai, N.; Kirchmair, J. Similarity-based methods and machine learning approaches for target prediction in early drug discovery: performance and scope, International Journal of Molecular Sciences, 2020, 21(10), 3585. The article is available at: https://hdl.handle.net/11250/2730409 [en_US]
dc.relation.haspart: Paper 3: Mathai, N.; Stork, C.; Kirchmair, J. BonMOLière: Small-sized libraries of readily purchasable compounds, optimized to produce genuine hits in biological screens across the protein space, International Journal of Molecular Sciences, 2021, 22(15), 7773. The article is available at: https://hdl.handle.net/11250/2768773 [en_US]
dc.rights: Attribution-NonCommercial (CC BY-NC). This item's rights statement or license does not apply to the included articles in the thesis.
dc.rights.uri: https://creativecommons.org/licenses/by-nc/4.0/
dc.title: Development, validation and application of in-silico methods to predict the macromolecular targets of small organic compounds [en_US]
dc.type: Doctoral thesis [en_US]
dc.date.updated: 2021-12-03T11:49:48.969Z
dc.rights.holder: Copyright the Author. [en_US]
dc.contributor.orcid: 0000-0002-5763-6304
dc.description.degree: Doktorgradsavhandling
fs.unitcode: 12-31-0
dc.date.embargoenddate: 2022-06-10

