dc.contributor.author: Paulsen, Mikkel Ekeland
dc.date.accessioned: 2023-01-26T08:27:25Z
dc.date.available: 2023-01-26T08:27:25Z
dc.date.created: 2022-12-02T11:45:26Z
dc.date.issued: 2022
dc.identifier.issn: 1384-6655
dc.identifier.uri: https://hdl.handle.net/11250/3046458
dc.description.abstract: The article investigates the two main corpus indicators of word commonness, frequency and dispersion, through a cross-validation analysis of frequency and four dispersion measures (‘Range’, ‘Chi-squared’, ‘Deviation of Proportions’ and ‘Juilland’s D’). The approach provides an estimation of the capacity of the named measures to predict the distribution of corpus items in an extracted language sample. Based on a dataset of 273 Norwegian compounds, the results show that especially Deviation of Proportions is a robust measure of dispersion that can be used in conjunction with frequency to substantiate assertions of word commonness based on corpus data. In addition, dispersion measures do not only reflect what sort of distribution the frequency statistic is generated from, but also how reliable the frequency estimation in the corpus sample is in terms of giving an accurate representation of frequency in the language variety that the corpus is sampled from. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: John Benjamins Publishing [en_US]
dc.title: Assessing Word Commonness - Adding dispersion to frequency [en_US]
dc.type: Journal article [en_US]
dc.type: Peer reviewed [en_US]
dc.description.version: acceptedVersion [en_US]
dc.rights.holder: Copyright John Benjamins Publishing Company [en_US]
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 2
dc.identifier.doi: 10.1075/ijcl.21037.eke
dc.identifier.cristin: 2087698
dc.source.journal: International Journal of Corpus Linguistics [en_US]
dc.identifier.citation: International Journal of Corpus Linguistics. 2022. [en_US]

