Speech-to-text models to transcribe emergency calls

Thuestad, Jens Andreas; Grutle, Øyvind

dc.contributor.author	Thuestad, Jens Andreas
dc.contributor.author	Grutle, Øyvind
dc.date.accessioned	2023-08-10T06:13:23Z
dc.date.available	2023-08-10T06:13:23Z
dc.date.issued	2023-06-01
dc.date.submitted	2023-08-08T22:00:25Z
dc.identifier.uri	https://hdl.handle.net/11250/3083251
dc.description.abstract	This thesis is part of the larger project “AI-Support in Medical Emergency Calls (AISMEC)”, which aims to develop a decision support system that helps Emergency Medical Communication Center (EMCC) operators better identify and respond to acute brain stroke. The system will use historical health data and the transcript of the emergency call to help the EMCC operator decide whether to dispatch an ambulance, and with what priority and urgency. Our research primarily focuses on adapting Whisper, an Automatic Speech Recognition (ASR) model, to transcribe Norwegian emergency calls robustly and accurately. The model was fine-tuned on simulated emergency calls and on recordings we made ourselves. Furthermore, we developed a proof-of-concept ASR web application with the goal of streamlining the manual task of transcribing emergency calls. When we demonstrated the application to the AISMEC researchers and to potential users, both groups were optimistic about its potential to streamline the transcription process. As part of our research, we conducted an experiment in which we started from the transcriptions suggested by the application and then corrected them for accuracy. This approach notably reduced our transcription time. We also found that establishing a machine learning pipeline to fine-tune the model on historical emergency calls was feasible. Further work would involve training the model on actual emergency calls. To investigate the efficiency of the ASR web application further, a larger-scale version of the semi-automatic transcription experiment could be conducted with the professional audio transcribers at Haukeland universitetssjukehus.
dc.language.iso	eng
dc.publisher	The University of Bergen
dc.rights	Copyright the Author. All rights reserved
dc.title	Speech-to-text models to transcribe emergency calls
dc.type	Master thesis
dc.date.updated	2023-08-08T22:00:25Z
dc.rights.holder	Copyright the Author. All rights reserved
dc.description.degree	Master's Thesis in Joint Master's Programme in Software Engineering - collaboration with HVL
dc.description.localcode	PROG399
dc.description.localcode	MAMN-PROG
dc.subject.nus	754199
fs.subjectcode	PROG399
fs.unitcode	12-12-0

Associated file(s)

Filename: Grutle_Thuestad_HVL_MSc_2023.pdf
Size: 995.0Kb
Format: PDF
Description: master thesis

This item appears in the following collection(s)

Department of Informatics [928]