Machine learning vs logistic regression in credit scoring: A trade-off between accuracy and interpretability?

Hovdenakk, Arne Hesjedal

dc.contributor.author	Hovdenakk, Arne Hesjedal
dc.date.accessioned	2021-06-30T23:56:26Z
dc.date.available	2021-06-30T23:56:26Z
dc.date.issued	2021-06-15
dc.date.submitted	2021-06-30T22:00:52Z
dc.identifier.uri	https://hdl.handle.net/11250/2762661
dc.description.abstract	In this thesis, I compare logistic regression with four machine learning models (k-nearest neighbors, decision trees, random forest, and gradient boosting) by building credit scoring models with each. Using data on consumer loan borrowers from an anonymous Norwegian bank, I evaluate the models both when continuous variables are binned into intervals using weight of evidence and when they are kept in their raw form. With the area under the receiver operating characteristic curve (AUROC) and the Brier score as performance measures, I find that logistic regression and gradient boosting are the most accurate models for this dataset, and I recommend logistic regression because of its interpretability.
dc.language.iso	eng
dc.publisher	The University of Bergen
dc.rights	Copyright the Author. All rights reserved
dc.subject	credit scoring
dc.subject	logistic regression
dc.subject	machine learning
dc.title	Machine learning vs logistic regression in credit scoring: A trade-off between accuracy and interpretability?
dc.type	Master thesis
dc.date.updated	2021-06-30T22:00:52Z
dc.rights.holder	Copyright the Author. All rights reserved
dc.description.degree	Masteroppgave
dc.description.localcode	ECON391
dc.description.localcode	MASV-SØK
dc.description.localcode	PROF-SØK
dc.subject.nus	734103
fs.subjectcode	ECON391
fs.unitcode	15-15-0

Associated file(s)

Filename: Master-thesis-June-2021.pdf
Size: 974.1 KB
Format: PDF
Description: Master thesis

This item appears in the following collection(s)

Master theses [119]
