Show simple item record

dc.contributor.author: Vestby, Magnus Lyngseth
dc.date.accessioned: 2020-07-20T04:16:06Z
dc.date.available: 2020-07-20T04:16:06Z
dc.date.issued: 2020-07-11
dc.date.submitted: 2020-07-10T22:03:02Z
dc.identifier.uri: https://hdl.handle.net/1956/23346
dc.description: Revised version: some spelling and formatting errors corrected.
dc.description.abstract [en_US]: This thesis analyses how level-of-living survey data can be explored using data mining techniques and how well the resulting patterns can be visualized to inform non-experts. The project used the design science research framework for its structure and methodology, and the knowledge discovery in databases (KDD) methodology for developing the models and visualizations. To answer the research questions, several machine learning methods were tested on a data set with selected variables describing education, disability, health, age, and marital status over a period of 50 years (1973-2017). The machine learning models were implemented with Scikit-learn. Ridge regression was found to be the optimal model for the goals of this thesis. The patterns found by the Ridge regression were visualized in graphs and bar charts. The visualizations were then evaluated using semi-structured interviews, tasks, and a visualization usability scale (VUS). The results show that visualizations based on the patterns found during data mining were informative and interesting to the participants in the evaluation. The visualizations scored highly on the visualization usability scale, with an average score of 87.5, meaning that the group had little to no trouble interpreting the graphs and figures. The participants were surprised by some of the discovered patterns regarding inequalities related to gender and level of education. This thesis represents a proof by construction: it shows that interesting patterns in the Norwegian level-of-living surveys can be found with the use of data mining techniques, and that these patterns can be visualized so that non-experts can retrieve information.
The model developed here can be reused for similar projects and data mining tasks, but future developers need to pay attention to all steps of the KDD process, including data cleaning. A proper user interface should be designed to help different kinds of user groups.
dc.language.iso: eng
dc.publisher: The University of Bergen
dc.rights: Copyright the Author. All rights reserved
dc.subject: level of living surveys
dc.subject: design science
dc.subject: data mining
dc.subject: VUS
dc.subject: visualizations
dc.subject: patterns
dc.subject: knowledge discovery in databases
dc.subject: machine learning
dc.subject: Ridge regression
dc.subject: visualization usability scale
dc.subject: kdd
dc.title: Data Mining in Norwegian Level-of-Living Survey Data
dc.type [en_US]: Master thesis
dc.date.updated: 2020-07-10T22:03:02Z
dc.rights.holder [en_US]: Copyright the Author. All rights reserved
dc.description.degree: Master's thesis in Information Science
dc.description.localcode: INFO390
dc.description.localcode: MASV-INFO
dc.description.localcode: MASV-IKT
dc.subject.nus: 735115
fs.subjectcode: INFO390
fs.unitcode: 15-17-0