The enrichment of lexical resources through incremental parsebanking
Rosén, Victoria; Thunes, Martha; Haugereid, Petter; Losnegaard, Gyri Smørdal; Dyvik, Helge J. Jakhelln; Meurer, Paul; Samdal, Gunn Inger Lyse; De Smedt, Koenraad
Peer reviewed, Journal article
Published version
Permanent link
https://hdl.handle.net/1956/15680
Publication date
2016-06
Original version
https://doi.org/10.1007/s10579-016-9356-5
Abstract
Automatic syntactic analysis of a corpus requires detailed lexical and morphological information that cannot always be harvested from traditional dictionaries. Therefore the development of a treebank presents an opportunity to simultaneously enrich the lexicon. In building NorGramBank, we use an incremental parsebanking approach, in which a corpus is parsed and disambiguated, and after improvements to the grammar and the lexicon, reparsed. In this context we have implemented a text preprocessing interface where annotators can enter unknown words or missing lexical information either before parsing or during disambiguation. The information added to the lexicon in this way may be of great interest both to lexicographers and to other language technology efforts.
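The incremental cycle described in the abstract (parse, find gaps, add lexical information, reparse) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Lexicon` class, the `parse` stand-in, the toy tag set and the example sentences are hypothetical and do not reflect NorGramBank's actual grammar, lexicon format or preprocessing interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Lexicon:
    """Toy lexicon mapping a word form to a tag (hypothetical format)."""
    entries: Dict[str, str] = field(default_factory=dict)

    def unknown_words(self, sentence: str) -> List[str]:
        # Words without an entry are the ones an annotator would be asked to supply.
        return [w for w in sentence.lower().split() if w not in self.entries]


def parse(sentence: str, lexicon: Lexicon) -> Optional[List[Tuple[str, str]]]:
    """Stand-in for a real parser: succeeds only if every word is in the lexicon."""
    if lexicon.unknown_words(sentence):
        return None
    return [(w, lexicon.entries[w]) for w in sentence.lower().split()]


def parsebank_round(corpus: List[str], lexicon: Lexicon) -> Dict[str, Optional[List[Tuple[str, str]]]]:
    """Parse every sentence once; None marks a sentence with no analysis."""
    return {s: parse(s, lexicon) for s in corpus}


corpus = ["the dog barks", "the parsebank grows"]
lexicon = Lexicon({"the": "DET", "dog": "N", "barks": "V"})

# Round 1: the second sentence fails because two words are missing.
first = parsebank_round(corpus, lexicon)
missing = lexicon.unknown_words("the parsebank grows")

# An annotator enters the missing words (hard-coded here for illustration).
lexicon.entries.update({"parsebank": "N", "grows": "V"})

# Round 2: reparsing with the enriched lexicon now covers the whole corpus.
second = parsebank_round(corpus, lexicon)
```

In this toy setting the lexicon itself is the lasting by-product of each round, which mirrors the abstract's point that treebank development and lexicon enrichment can proceed together.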