Inferring Gene Expression Values In Causal Directed Acyclic Graphs Using Graph Neural Networks

Solevåg, Bendik Akselsen

Master thesis

Open

master thesis (3.487Mb)

Permanent link

https://hdl.handle.net/11250/3126366

Publication date

2023-08-21

Collections

Master theses [201]

Abstract

Inferring gene expression values helps determine important characteristics of an individual. Existing methods for gene expression inference mostly rely on linear methods that create a separate model for each gene. This thesis hypothesises that a graph neural network can be used to model interactions between genes and serve as a universal approximator for gene expression values. The research goals of this thesis are stated in the following four points.

1. Does prediction accuracy improve when genome variation data is also provided in the dataset?
2. Can the graph feature autoencoder architecture be applied to predict missing gene expression values in a masked dataset?
3. Can a graph neural network be applied to predict missing gene expression values in a masked sample?
4. Can dataset gene expression values be extrapolated using only genome variation data?

An experiment was set up to answer these questions. The results indicate that prediction accuracy does improve when genome variation data is provided, and that the graph feature autoencoder architecture was applied successfully. This thesis was not able to create a graph neural network that predicts gene expression values in a masked sample, nor was it able to reliably extrapolate gene expression data using only genome variation data.

Publisher

The University of Bergen