A Dimensionality Reducing Extension of Bayesian Relevance Learning

Heimsæter, Sandra Vervik

Master thesis

Open

Master Thesis (1.592Mb)

Permanent link

https://hdl.handle.net/11250/2737181

Publication date

2021-02-11

Collections

Master theses [120]

Abstract

When modeling big, high-dimensional data, it is crucial to extract the most important information from the data set while avoiding overfitting. Well-developed sparse methods address this by building models that use only the most informative part of the data and are therefore less likely to overfit. In this thesis, we develop an algorithm that performs sample and feature selection simultaneously for supervised learning on big data. This parametric Bayesian regression method is based on a well-known Bayesian sparse learning method, the relevance vector machine (RVM). The derivation of the algorithm is inspired by the probabilistic feature selection and classification vector machine (PFCVM), an extension of the RVM classification model that selects samples and features simultaneously. The resulting method, the dimensionality reducing relevance vector machine (DRVM), performs simultaneous feature and sample selection in the regression case. The proposed model is sparse, in that it uses only the most important features and samples to explain the input data, while remaining accurate in its predictions.
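
For readers unfamiliar with the RVM that the abstract builds on, the following is a minimal sketch of the standard sparse Bayesian regression updates (one prior precision per weight, re-estimated by evidence maximization, with weights pruned once their precision grows very large). It is not the DRVM from the thesis, whose update rules are not given in this record. The function names, the pruning threshold `alpha_max`, the iteration count, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def posterior(P, t, alpha, beta):
    """Gaussian posterior over the weights for fixed hyperparameters."""
    Sigma = np.linalg.inv(beta * P.T @ P + np.diag(alpha))
    mu = beta * Sigma @ P.T @ t
    return mu, Sigma

def sparse_bayes_fit(Phi, t, n_iter=300, alpha_max=1e6):
    """Illustrative RVM-style fit of t = Phi @ w + noise.

    Returns the indices of the retained columns of Phi and their
    posterior mean weights. Threshold and iteration count are assumptions.
    """
    n, m = Phi.shape
    alpha = np.ones(m)        # prior precision of each weight
    beta = 1.0 / np.var(t)    # initial guess for the noise precision
    keep = np.arange(m)       # columns still in the model
    for _ in range(n_iter):
        P = Phi[:, keep]
        mu, Sigma = posterior(P, t, alpha[keep], beta)
        # gamma_i measures how well weight i is determined by the data
        gamma = np.clip(1.0 - alpha[keep] * np.diag(Sigma), 0.0, 1.0)
        alpha[keep] = gamma / (mu**2 + 1e-12)
        resid = t - P @ mu
        beta = max(n - gamma.sum(), 1e-6) / (resid @ resid + 1e-12)
        # drop columns whose weights are pinned to zero by a huge precision
        keep = keep[alpha[keep] < alpha_max]
        if keep.size == 0:
            return keep, np.empty(0)
    mu, _ = posterior(Phi[:, keep], t, alpha[keep], beta)
    return keep, mu

# Synthetic example: only columns 0, 3 and 7 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[[0, 3, 7]] = [2.0, -3.0, 1.5]
t = X @ w_true + 0.1 * rng.normal(size=200)

keep, mu = sparse_bayes_fit(X, t)
print("retained columns:", keep)
print("posterior mean weights:", mu)
```

Here the columns of `Phi` are raw features, so pruning acts as feature selection. In the usual RVM setting each column is instead a kernel function centered on one training sample, so the same pruning selects samples (the relevance vectors). According to the abstract, the DRVM achieves both kinds of selection at once.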