Local Likelihood
Abstract
Methods for probability density estimation are traditionally classified as either parametric or non-parametric. Fitting a parametric model to observations is generally a good idea when we have sufficient information about the origin of our data; if not, we must turn to non-parametric methods, usually at the cost of poorer performance. This thesis discusses local maximum likelihood estimation of probability density functions, which can be regarded as a compromise between the two approaches. The idea is to fit a parametric model locally, that is, to let the parameters and their estimates depend on the location. If the chosen model is close to the true, unknown density, we keep many of the appealing properties of a full parametric approach. On the other hand, local likelihood density estimates perform comparably to well-known non-parametric methods, even when the locally fitted parametric model differs from the true density in a global sense. Although traditional methods have stood the test of time as excellent options in many situations, the local maximum likelihood estimator opens up a range of applications. Hjort and Jones [1996], the main reference for this thesis, call it semi-parametric density estimation, as it is particularly useful when we have partial knowledge of the shape of the unknown density, but not enough to trust the ordinary, global maximum likelihood estimates. Further, many authors have extended the idea of locally parametric estimation to applications beyond density estimation; some of these works are mentioned and cited throughout the thesis. One-dimensional density estimation is, however, the primary focus here, with particular emphasis on large-sample theory. The main results concern the asymptotic bias, which is shown to be of larger order than the bias of traditional kernel estimation as the sample size increases to infinity and the bandwidth decreases towards zero.
Nonetheless, in practical situations with reasonable sample sizes, the local likelihood estimator is shown to perform very well, with an appealing robustness against under- and oversmoothing. Indeed, none of the experiments performed shows signs of deterioration of the local likelihood estimates relative to kernel estimation as the sample size grows.
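
To illustrate what fitting a parametric model locally can look like, the following minimal Python sketch may be helpful. It assumes that the local log-likelihood at a point x takes the kernel-weighted form associated with Hjort and Jones [1996], that is, (1/n) sum_i K_h(X_i - x) log f(X_i, theta) - int K_h(t - x) f(t, theta) dt; this formula is not stated in the abstract itself. The sketch uses the simplest conceivable local model, a constant f(t, theta) = theta, for which the local maximiser coincides with the ordinary kernel density estimate. The function names, the Gaussian kernel and the golden-section search are illustrative choices only, not the implementation used in the thesis.

```python
import math
import random


def gauss_kernel(u, h):
    """Scaled Gaussian kernel K_h(u) = K(u / h) / h."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def kde(x, data, h):
    """Ordinary kernel density estimate at x, used here as a reference."""
    return sum(gauss_kernel(xi - x, h) for xi in data) / len(data)


def golden_section_max(fn, lo, hi, tol=1e-10):
    """Maximise a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if fn(c) > fn(d):
            b = d  # the maximiser lies in [a, d]
        else:
            a = c  # the maximiser lies in [c, b]
    return 0.5 * (a + b)


def local_constant_fit(x, data, h):
    """Maximise the assumed local log-likelihood for f(t, theta) = theta.

    With a constant model the criterion reduces to
        w * log(theta) - theta,  w = (1/n) sum_i K_h(X_i - x),
    because the kernel integrates to one. It is concave in theta.
    """
    w = sum(gauss_kernel(xi - x, h) for xi in data) / len(data)

    def local_loglik(theta):
        return w * math.log(theta) - theta

    # w can never exceed the kernel's peak value, so that is a safe upper bound.
    upper = 1.0 / (h * math.sqrt(2.0 * math.pi))
    return golden_section_max(local_loglik, 1e-12, upper)


if __name__ == "__main__":
    random.seed(1)
    sample = [random.gauss(0.0, 1.0) for _ in range(200)]
    for x0 in (-1.0, 0.0, 1.5):
        print(x0, local_constant_fit(x0, sample, 0.4), kde(x0, sample, 0.4))
```

Under these assumptions the two printed estimates at each point should agree up to the numerical tolerance of the search. Replacing the constant by a richer local family, such as a normal density with location-dependent parameters, changes only the model term and its integral, and it is in that setting that the estimator departs from kernel estimation.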