Automated Moderation: Detecting Irony in a Norwegian Facebook Comment Section using a Longformer Transformer Model with a Context Encoded Dataset

Hatlebakk, Torstein

Master thesis

Open

master thesis (1.187Mb)

attachment (19.97Kb)

Permanent link

https://hdl.handle.net/11250/3001953

Publication date

2022-06-01

Collections

Master theses [247]

Abstract

Irony is a complex phenomenon of human communication and, due to its contextual nature, has been notoriously difficult for machine learning algorithms to detect. This thesis establishes a practical definition of irony grounded in the environment of Facebook comment sections. The definition is used together with a Norwegian-language pre-trained BERT model, converted to a long version that supports longer text inputs, and a Norwegian Facebook comment dataset that includes the contextual article and reply comment text. The long BERT model trained on the context-encoded dataset outperformed the short BERT models trained on datasets with the same number of comments, and with more comments, but without the contextual information encoded.
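
As an illustration of what a context-encoded input might look like, the following is a minimal sketch and not the method used in the thesis. The function names, the field names (`article`, `thread_comment`, `comment`), the `[SEP]` separator, and the whitespace-based length check are all assumptions made for this example. The 512-token figure is the usual input limit of a standard BERT model, which is the constraint a long model is meant to lift.

```python
# Hypothetical sketch: names, separator, and field order are assumptions
# for illustration and are not taken from the thesis.
SEP = " [SEP] "  # BERT-style separator token, assumed here


def encode_with_context(article: str, thread_comment: str, comment: str) -> str:
    """Join the context segments and the target comment into one input string."""
    parts = [article.strip(), thread_comment.strip(), comment.strip()]
    # Drop empty segments, e.g. when a comment has no related thread comment.
    return SEP.join(p for p in parts if p)


def fits_short_model(text: str, max_tokens: int = 512) -> bool:
    """Rough check against a short-model limit.

    Counts whitespace-separated words as a crude stand-in for real
    subword tokens, so it underestimates the true token count.
    """
    return len(text.split()) <= max_tokens


if __name__ == "__main__":
    encoded = encode_with_context(
        "Article text goes here.",
        "",  # no related thread comment in this example
        "Comment to classify.",
    )
    print(encoded)
    print(fits_short_model(encoded))
```

Inputs that fail a check like `fits_short_model` would have to be truncated for a short model, whereas a long model could take the article and comment context in full.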

Publisher

The University of Bergen