dc.contributor.author | Lyse, Gunn Inger | |
dc.contributor.author | Andersen, Gisle | |
dc.date.accessioned | 2016-02-04T10:28:45Z | |
dc.date.available | 2016-02-04T10:28:45Z | |
dc.date.issued | 2012 | |
dc.Published | Lyse, Gunn Inger & Gisle Andersen. 2012. Collocations and statistical analysis of n-grams: Multiword expressions in newspaper text. In Andersen, Gisle (ed.) Exploring Newspaper Language. Amsterdam/New York: John Benjamins, 79-109. | eng |
dc.identifier.isbn | 978-90-272-0354-0 | |
dc.identifier.issn | 1388-0373 | |
dc.identifier.uri | https://hdl.handle.net/1956/11033 | |
dc.description.abstract | Multiword expressions (MWEs) are words that co-occur so often that they are perceived as a linguistic unit. Since MWEs pervade natural language, their identification is pertinent for a range of tasks within lexicography, terminology and language technology. We apply various statistical association measures (AMs) to word sequences from the Norwegian Newspaper Corpus (NNC) in order to rank two- and three-word sequences (bigrams and trigrams) in terms of their tendency to co-occur. The results show that some statistical measures favour relatively frequent MWEs (e.g. i motsetning til ‘as opposed to’), whereas other measures favour relatively low-frequent units, which typically comprise loan words (de facto), technical terms (notarius publicus) and phrasal anglicisms (practical jokes; cf. G. Andersen this volume). On this basis we evaluate the relevance of each of these measures for lexicography, terminology and language technology purposes. | en_US |
dc.language.iso | eng | eng |
dc.publisher | John Benjamins Publishing Company | eng |
dc.relation.ispartofseries | Studies in Corpus Linguistics | eng |
dc.title | Collocations and statistical analysis of n-grams: Multiword expressions in newspaper text | eng |
dc.type | Chapter | |
dc.type | Peer reviewed | |
dc.description.version | acceptedVersion | |
dc.rights.holder | Copyright John Benjamins Publishing Company. All rights reserved | eng |
dc.identifier.doi | https://doi.org/10.1075/scl.49.05lys | |
dc.identifier.cristin | 922217 | |
dc.source.40 | 49 | |
dc.source.pagenumber | 79-109 | |