The enrichment of lexical resources through incremental parsebanking
Rosén, Victoria; Thunes, Martha; Haugereid, Petter; Losnegaard, Gyri Smørdal; Dyvik, Helge J. Jakhelln; Meurer, Paul; Samdal, Gunn Inger Lyse; De Smedt, Koenraad
Peer reviewed, Journal article
Published version
Date
2016-06
Original version
https://doi.org/10.1007/s10579-016-9356-5
Abstract
Automatic syntactic analysis of a corpus requires detailed lexical and morphological information that cannot always be harvested from traditional dictionaries. Therefore, the development of a treebank presents an opportunity to simultaneously enrich the lexicon. In building NorGramBank, we use an incremental parsebanking approach, in which a corpus is parsed and disambiguated and then, after improvements to the grammar and the lexicon, reparsed. In this context, we have implemented a text preprocessing interface where annotators can enter unknown words or missing lexical information either before parsing or during disambiguation. The information added to the lexicon in this way may be of great interest both to lexicographers and to other language technology efforts.