Mining for individual patient outcome prediction in hip arthroplasty registry data
This thesis develops and evaluates individual patient outcome prediction models based on hip arthroplasty registry data. The working assumption was that the arthroplasty registry held a rich data collection suitable for exploration with data mining methods. The work was conducted in two major phases: exploratory data analysis, followed by predictive modelling made possible by the findings of the exploration phase. To explore the dataset, clustering was used to identify similarities and distinctions between groups of patient records. The exploration resulted in the engineering and selection of dependent features for the predictive modelling. These dependent features were used for three separate perspectives on modelling a patient outcome grounded in the length of survival of a prosthetic device: two classification tasks, one with a binary outcome and one with a multinomial outcome, and a prediction of survival as a continuous outcome. The classification tasks attempted to assign patients to categories defined by length of device survival: above or below eight years for the binary task, and below five, between five and ten, or above ten years for the multinomial task. Three learning algorithms from Scikit-learn were used to examine the predictive capabilities of the dataset and to compare performance. The best performance was observed for the multi-layer perceptron classifier on the binary classification task. The other two algorithms, logistic regression and the random forest classifier, performed comparably well in binary classification. None of the models produced reliable results in multinomial classification or in predicting the exact survival year. The results suggest that the independent variables did not have enough explanatory power to support the more complicated predictions.
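
The outcome categories described in the abstract can be illustrated with a minimal sketch of how a device survival time might be turned into binary and multinomial class labels. This is not the thesis code: the function names, the label strings, the sample values, and the handling of the exact boundary years (eight, five, and ten) are all assumptions made for illustration, since the abstract does not state them.

```python
# Illustrative sketch only; names, labels, and boundary handling are assumptions.

def binary_label(survival_years: float, threshold: float = 8.0) -> int:
    """Return 1 if the device survived at least `threshold` years, else 0.

    Assumption: a survival of exactly eight years counts as "above".
    """
    return 1 if survival_years >= threshold else 0


def multinomial_label(survival_years: float) -> str:
    """Assign one of three survival categories.

    Assumption: the middle category includes both five and ten years.
    """
    if survival_years < 5.0:
        return "under_5"
    if survival_years <= 10.0:
        return "5_to_10"
    return "over_10"


# Made-up survival times in years, not registry data.
survival = [2.5, 7.9, 8.0, 10.0, 14.2]

binary = [binary_label(s) for s in survival]
multi = [multinomial_label(s) for s in survival]

print(binary)
print(multi)
```

Labels of this kind could then serve as targets for scikit-learn estimators such as `LogisticRegression`, `RandomForestClassifier`, and `MLPClassifier`, while the raw survival time would be the target for the continuous prediction task.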