Multivariate and conditional density estimation using local Gaussian approximations
Abstract
Paper 1 "Bias and bandwidth for local likelihood density estimation": A local likelihood density estimator is shown to have asymptotic bias depending on the dimension of the local parameterization. A comparison with kernel estimation over a variety of bandwidths demonstrates that local likelihood may give estimates that are as good, and potentially even better. Boundary effects are also examined.
Paper 2 "The locally Gaussian density estimator for multivariate data": It is well known that the Curse of Dimensionality causes the standard Kernel Density Estimator to break down quickly as the number of variables increases. In non-parametric regression, this effect is relieved in various ways, for example by assuming additivity or some other simplifying structure on the interaction between variables. This paper presents the Locally Gaussian Density Estimator (LGDE), which introduces a similar idea to the problem of density estimation. The LGDE is a new method for the non-parametric estimation of multivariate probability density functions. It is based on preliminary transformations of the marginal observation vectors towards standard normality, and a simplified local likelihood fit of the resulting distribution with standard normal marginals. The LGDE is introduced, and asymptotic theory is derived. In particular, it is shown that the LGDE converges at a speed that does not depend on the dimension. Examples using real and simulated data confirm that the new estimator performs very well in finite samples.
Paper 3 "Non-parametric estimation of conditional density functions: A new method": Let X = (X1, ..., Xp) be a stochastic vector having joint density function fX(x) with partitions X1 = (X1, ..., Xk) and X2 = (Xk+1, ..., Xp). A new method for estimating the conditional density function of X1 given X2 is presented. It is based on locally Gaussian approximations, but simplified in order to tackle the curse of dimensionality in multivariate applications, where both response and explanatory variables can be vectors. We compare our method to some available competitors. The error of approximation is shown to be small in a series of examples using real and simulated data, and the estimator is shown to be particularly robust against noise caused by independent variables. We also present examples of practical applications of our conditional density estimator in the analysis of time series. Typical values for k in our examples are 1 and 2, and we include simulation experiments with values of p up to 6. Large sample theory is established under a strong mixing condition.
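The abstract of Paper 2 mentions a preliminary transformation of each marginal towards standard normality. A minimal sketch of one way such a step could look is given below. It is an illustration only: the rank-based estimate of the marginal distribution function and the names to_pseudo_normal and transform_marginals are assumptions made here, not taken from the thesis, which may estimate the marginals differently. The local likelihood fit itself is not shown.

```python
from statistics import NormalDist


def to_pseudo_normal(column):
    """Map one marginal sample to approximately standard normal scores.

    Assumption: the marginal CDF is estimated by scaled ranks,
    rank / (n + 1), which keeps every value strictly inside (0, 1).
    Ties are broken by input order, so continuous data is assumed.
    """
    n = len(column)
    # order[r - 1] is the index of the observation with rank r
    order = sorted(range(n), key=lambda i: column[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Standard normal quantile of the estimated CDF value
        scores[i] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def transform_marginals(rows):
    """Apply the marginal transformation to each variable separately.

    `rows` is a list of observations, each a list of p numbers.
    """
    columns = list(zip(*rows))
    transformed = [to_pseudo_normal(list(c)) for c in columns]
    # Transpose back to one row per observation
    return [list(r) for r in zip(*transformed)]
```

Each column is handled on its own, so the dependence between variables is left untouched, and a multivariate fit would then be carried out on the transformed observations.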
Has parts
Paper 1: Håkon Otneim, Hans Arnfinn Karlsen, and Dag Tjøstheim. "Bias and bandwidth for local likelihood density estimation." Statistics & Probability Letters 83.5 (2013): 1382-1387. The article is available in the thesis. The published version is also available at: http://dx.doi.org/10.1016/j.spl.2013.02.003
Paper 2: Håkon Otneim and Dag Tjøstheim. "The locally Gaussian density estimator for multivariate data." Statistics & Computing 27.6 (2017): 1595-1616. Submitted version is available in the thesis. The published version is available at: http://dx.doi.org/10.1007/s11222-016-9706-6
Paper 3: Håkon Otneim and Dag Tjøstheim. "Non-parametric estimation of conditional densities: A new method."