Non-domain specific and translated sentiment analysis - Local lexical analyzer

Olsen, Erik Kringstad

Master thesis

Published version

Open

135270779.pdf (1.715Mb)

Permanent link

http://hdl.handle.net/1956/10222

Publication date

2015-06-01

Collections

Department of Information Science and Media Studies [847]

Abstract

There is an ever-growing amount of opinionated data available on the Web, in the form of reviews, discussions and blogs. This data can potentially provide a lot of information through sentiment analysis and data mining in general. However, most research in the field of sentiment analysis has been locked to a single language and a single domain. Thus, the main objective of this thesis is to answer the question: "How can a sentiment analysis tool that is independent of domain and language be developed?" The aim of this thesis is to show the possibilities of a sentiment analysis program that is independent of language and domain, and that can be used in different languages and domains without added effort. A sentiment analysis program was developed to test the feasibility of sentiment analysis across different domains and in languages other than English. Three main methods of sentiment analysis were implemented in the program, and tests were run on three different datasets. The three methods are a single weight analysis method, where a document's sentiment equals the sum of its words' sentiments; a sentence analyzer, where a document is analyzed sentence by sentence; and a co-occurrence analyzer, which operates on the assumption that words occurring together often share the same sentiment value. The results do not reach the quality reported in other published articles and, in the case of the non-English analysis, are inconclusive. However, the possible viability of a completely resourceless sentiment analysis program is shown. Further research and improvement are needed to achieve better results.
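
The first two methods named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the toy lexicon, the tokenizer, the punctuation-based sentence splitting and the function names are all assumptions made for illustration, and the co-occurrence analyzer is not covered.

```python
import re

# Hypothetical toy lexicon (word -> sentiment weight); not taken from the thesis.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def single_weight_score(document, lexicon=LEXICON):
    """Single weight analysis: document sentiment is the sum of its words' sentiments.

    Words missing from the lexicon contribute 0.
    """
    return sum(lexicon.get(word, 0.0) for word in tokenize(document))


def sentence_scores(document, lexicon=LEXICON):
    """Sentence analysis: score the document one sentence at a time.

    Sentences are split on '.', '!' and '?', which is a simplifying assumption.
    """
    sentences = [s for s in re.split(r"[.!?]+", document) if s.strip()]
    return [single_weight_score(s, lexicon) for s in sentences]


if __name__ == "__main__":
    review = "Great movie. Bad ending!"
    print(single_weight_score(review))  # whole-document score
    print(sentence_scores(review))      # one score per sentence
```

With this toy lexicon, a review such as "Great movie. Bad ending!" receives a positive document score overall, while the per-sentence scores show that the second sentence is negative.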

Publisher

The University of Bergen