
Semantics driven anaphora resolution

Skaugen, Håvar
Master thesis
View/Open
141259970.pdf (693.5Kb)
URI
https://hdl.handle.net/1956/10867
Date
2015-11-20
Collections
  • Department of Linguistics, Literary and Aesthetic Studies [809]
Abstract
This thesis describes a method for generating semantically motivated antecedent candidates for use in pronominal anaphora resolution. Predicate-argument structures are extracted from a large corpus of text parsed by the NorGram grammar and used as the basis for a fuzzy classification model. Given a pronominal anaphor, the model generates antecedent candidates ranked by the frequency with which they co-occur in the same lexical context as the anaphor. This set of candidates is intersected with the set of nouns gathered from the anaphor's recent context. A selection of basic heuristics is then introduced to the model in a permutational fashion to gauge their individual and combined effect on the model's accuracy. The model reached an accuracy of 56.22% correct predictions. Additionally, in a slightly modified model the correct antecedent was found within the antecedent candidate list for 87.12% of the anaphors.
 
In this thesis I describe a method for generating semantically motivated antecedent candidates for use in anaphora resolution. Predicate-argument structures are extracted from a large corpus of text tagged with the NorGram grammar and used as the basis for a "fuzzy" classification model. The model generates antecedent candidates for pronominal anaphors, ranked by the frequency with which they occur in the same lexical context as the anaphor. This set of candidates is intersected with the set of nouns in the anaphor's preceding context. A selection of simple heuristics is added to the model in different permutations to measure their combined and individual effect on the model's accuracy. The model reached an accuracy of 56.22% correctly classified antecedents. For a partially modified version of the model, the correct antecedent is found among the antecedent candidates in 87.12% of cases.
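
The candidate-generation step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the toy triples, the `(predicate, role)` key used as the "lexical context", and the function names `build_model` and `rank_candidates` are all assumptions made for illustration. It only shows the general idea of ranking nouns by how often they fill the anaphor's slot in a corpus, then intersecting that ranking with the nouns found in the anaphor's recent context.

```python
from collections import Counter

# Hypothetical toy data standing in for predicate-argument structures
# extracted from a parsed corpus: (predicate, argument role, noun).
TRIPLES = [
    ("eat", "OBJ", "apple"),
    ("eat", "OBJ", "apple"),
    ("eat", "OBJ", "bread"),
    ("eat", "OBJ", "soup"),
    ("read", "OBJ", "book"),
]


def build_model(triples):
    """Count how often each noun fills a given (predicate, role) slot."""
    model = {}
    for predicate, role, noun in triples:
        model.setdefault((predicate, role), Counter())[noun] += 1
    return model


def rank_candidates(model, predicate, role, context_nouns):
    """Rank nouns seen in the anaphor's slot, keeping only context nouns.

    The filter is the intersection step: corpus-derived candidates that
    do not occur in the anaphor's recent context are discarded.
    """
    counts = model.get((predicate, role), Counter())
    allowed = set(context_nouns)
    return [(noun, n) for noun, n in counts.most_common() if noun in allowed]


model = build_model(TRIPLES)

# Assumed example: "The book lay next to the apple. She ate it."
# The anaphor "it" is the object of "eat"; the context nouns are
# "book" and "apple".
candidates = rank_candidates(model, "eat", "OBJ", ["book", "apple"])
print(candidates)  # [('apple', 2)]
```

In this sketch "book" is dropped because it never occurs as the object of "eat" in the toy corpus, which is the kind of semantic filtering the abstract describes. The heuristics mentioned in the abstract are not modelled here.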
 
Publisher
The University of Bergen
Copyright
Copyright the author. All rights reserved

Contact Us | Send Feedback

Privacy policy
DSpace software copyright © 2002-2019  DuraSpace

Service from  Unit