Making sense of nonsense : Integrated gradient-based input reduction to improve recall for check-worthy claim detection

Sheikhi, Ghazaal; Opdahl, Andreas Lothe; Touileb, Samia; Setty, Vinay

dc.contributor.author	Sheikhi, Ghazaal
dc.contributor.author	Opdahl, Andreas Lothe
dc.contributor.author	Touileb, Samia
dc.contributor.author	Setty, Vinay
dc.date.accessioned	2023-12-20T11:57:32Z
dc.date.available	2023-12-20T11:57:32Z
dc.date.created	2023-09-21T15:56:59Z
dc.date.issued	2023
dc.identifier.issn	1613-0073
dc.identifier.uri	https://hdl.handle.net/11250/3108380
dc.description.abstract	Analysing long text documents of political discourse to identify check-worthy claims (claim detection) is an important task in automated fact-checking systems, as it saves fact-checkers' time and allows for more fact-checks. However, existing methods use black-box deep neural NLP models to detect check-worthy claims, which limits the understanding of the models and the mistakes they make. The aim of this study is therefore to leverage an explainable neural NLP method to improve the claim detection task. Specifically, we exploit the well-known integrated gradient-based input reduction on textCNN and BiLSTM to create two different reduced claim data sets from ClaimBuster. We observe that a higher recall in check-worthy claim detection is achieved on the data reduced by BiLSTM than by the models trained on the original claims. This is an important observation, since the cost of overlooking check-worthy claims is high in claim detection for fact-checking. Higher recall is also observed when a pre-trained BERT sequence classification model is fine-tuned on the reduced data set. We argue that removing superfluous tokens using explainable NLP could unlock the true potential of neural language models for claim detection, even though the reduced claims might make no sense to humans. Our findings provide insights on task formulation, design of annotation schema and data set preparation for check-worthy claim detection.	en_US
dc.language.iso	eng	en_US
dc.publisher	CEUR	en_US
dc.relation.ispartof	Proceedings of the 5th Symposium of the Norwegian AI Society (NAIS 2023)
dc.relation.uri	https://ceur-ws.org/Vol-3431/paper8.pdf
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/deed.no	*
dc.title	Making sense of nonsense : Integrated gradient-based input reduction to improve recall for check-worthy claim detection	en_US
dc.type	Chapter	en_US
dc.description.version	publishedVersion	en_US
dc.rights.holder	Copyright 2023 The Author(s)	en_US
cristin.ispublished	true
cristin.fulltext	original
dc.identifier.cristin	2177670
dc.relation.project	Norges forskningsråd: 309339	en_US
dc.identifier.citation	In: NAIS 2023: The 2023 symposium of the Norwegian AI Society	en_US

Associated file(s)

Filename: 2177670_OA_Setty.pdf
Size: 1.082Mb
Format: PDF
Description: PDF

This item appears in the following collection(s)

Department of Information Science and Media Studies [853]
Registrations from Cristin [9673]

Except where otherwise noted, this item is licensed as Attribution 4.0 International