Natural Language Processing of fiscal yearly reports for use in risk assessment

Tenmann, Magnus

Master thesis

URI

https://hdl.handle.net/1956/21343

Date

2019-12-09

Collections

Master theses [203]

Abstract

The stability and accuracy of products in the financial sector are maintained by various measures within each organisation in the field in which it operates. A meeting with DNB Livsforsikring, which offers insurance products, identified that the risk assessment processes currently applied in this context could benefit from language processing technologies. Adopting them could lead to profit optimisation for the company, decreased costs of human labour, and potentially a reduction in error, depending on the accuracy of the implemented technology. This research was conducted in cooperation with DNB with the aim of developing an application that utilises existing libraries for Natural Language Processing (NLP) to perform text extraction and topic modelling on fiscal reports provided by DNB. Design science research has been used to create an artifact that uses text extraction for analytics of fiscal yearly reports. Other textual visualisations are implemented, such as word clouds and Latent Dirichlet Allocation (LDA) topic models. The implementation utilises a variety of technologies, including the NLTK library and other common data science libraries such as scikit-learn. The main functionalities of the resulting artifact are text extraction and the visualisation of topic modelling, TF-IDF, word cloud generation and frequency distribution, all of which were fully functional as separate components. As part of the development process, a number of subject-specific methods have been used, such as agile development and the minimum viable product approach. The evaluation of the prototype has shown perceived usefulness, relevance to the intended application, understandability, practicality and the ability to produce some relevant results.

Publisher

The University of Bergen