## Quality of the Analysis Step in EnKF

##### Abstract

In many physical applications we want to characterize the parameters of a system based on indirect observations or measurements.
In a reservoir simulator setting, the goal is to simulate the production of hydrocarbons from the reservoir. This way we can try out different production strategies and optimize the production plan before the reservoir is put on production. These decisions depend on good simulations of the flow of oil, gas and water in the porous rocks.
To achieve appropriate flow calculations, a good estimate of the flow properties of the rock is needed. The process of building an approximation to the reservoir itself and its properties is called reservoir modeling or reservoir characterization. For this, prior information is used, such as well logs, analyzed core plugs from the appraisal wells and seismic data. This information gives us some estimate of our poorly known reservoir parameters, like the porosity and permeability fields. The performance of the reservoir, given a recovery strategy, can be predicted by a reservoir simulator. After the field is put on production, one may use the production data to improve the reservoir model. The basic idea is that the predicted performance should match the observed performance. By tuning the parameters in the model, one tries to fit the output of the simulator to the production history. This is referred to as history matching, which is a nonlinear inverse problem.
A promising method for performing the history matching automatically is the Ensemble Kalman Filter (EnKF). The EnKF is a sequential data assimilation algorithm using Monte Carlo techniques, where measurements and prior information about the system are combined to make the best weighted estimate based on their uncertainties. After the assimilation, the model is run forward in time using the reservoir simulator. When new observations or data become available, the next analysis step incorporates them to produce a new analyzed estimate.
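The analysis step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in this work: it assumes the perturbed-observation (stochastic) variant of the EnKF and a linear measurement operator, and the names `enkf_analysis`, `X`, `H`, `d` and `R` are chosen here for illustration only.

```python
import numpy as np


def enkf_analysis(X, H, d, R, rng):
    """One stochastic EnKF analysis step (illustrative sketch).

    X   : (n, N) forecast ensemble, one member per column
    H   : (m, n) linear measurement operator (an assumption of this sketch)
    d   : (m,)   observed data
    R   : (m, m) measurement error covariance
    rng : numpy.random.Generator used to perturb the observations
    """
    n_members = X.shape[1]

    # Ensemble anomalies: deviations of each member from the ensemble mean.
    A = X - X.mean(axis=1, keepdims=True)
    HA = H @ A

    # Sample covariances estimated from the ensemble. The full n-by-n
    # state covariance matrix is never formed explicitly.
    PHt = A @ HA.T / (n_members - 1)      # (n, m)
    HPHt = HA @ HA.T / (n_members - 1)    # (m, m)

    # Each member is updated against its own perturbed copy of the data.
    noise = rng.multivariate_normal(np.zeros(d.size), R, size=n_members).T
    D = d[:, None] + noise                # (m, N)

    # Kalman gain K = PHt (HPHt + R)^{-1}, computed with a linear solve
    # (the matrix HPHt + R is symmetric).
    K = np.linalg.solve(HPHt + R, PHt.T).T

    return X + K @ (D - H @ X)
```

Calling `enkf_analysis(X, H, d, R, np.random.default_rng(0))` returns an analyzed ensemble of the same shape as `X`, which would then be propagated forward by the simulator until the next set of observations arrives.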
The EnKF is based on the Kalman Filter, which is a recursive filter for linear problems. One computational advantage of the EnKF is that the covariance matrix of the system is never explicitly calculated, but rather approximated from the ensemble itself. However, assimilating a large number of data at the same time, as is the case with e.g. 4D seismic data, has proved to be a difficult challenge for the EnKF. Spurious correlations in the ensemble sample covariance matrix are one problem to be addressed. In particular, properties in cells far away from the location of the measurements are updated too strongly.
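The spurious correlations mentioned above can be illustrated with a small experiment. The sketch below is an assumption-laden toy case, not a reservoir model: the cells are drawn as independent standard normal variables, so every true cross-correlation is zero and any non-zero sample correlation is purely a sampling artifact. The function name `max_spurious_correlation` is hypothetical.

```python
import numpy as np


def max_spurious_correlation(n_cells, n_members, rng):
    """Largest absolute sample correlation between truly independent cells."""
    # Rows are cells (variables), columns are ensemble members.
    X = rng.standard_normal((n_cells, n_members))

    # Sample correlation matrix estimated from the ensemble.
    C = np.corrcoef(X)

    # Remove the diagonal so that only cross-correlations remain.
    off_diagonal = C - np.diag(np.diag(C))
    return np.abs(off_diagonal).max()


rng = np.random.default_rng(0)
small_ensemble = max_spurious_correlation(200, 20, rng)
large_ensemble = max_spurious_correlation(200, 2000, rng)
print(small_ensemble, large_ensemble)
```

With few ensemble members the largest spurious correlation is substantial, and it shrinks as the ensemble grows. In the analysis step such correlations translate into updates of cells that the measurements carry no real information about.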
In this master's thesis we consider the quality of the analysis step of the EnKF. Our main focus is the sampling error caused by the approximated sample covariance matrix when an increasing number of measurements is assimilated. The work here is inspired by [KovalenkoSamplingError2011, KovalenkoECMOR], where a probabilistic measure for the sampling error is derived under the assumptions of a normally distributed prior and negligible measurement errors. Here we try a somewhat different approach, using approximate calculations and Neumann series to assess the sampling error. We consider measurement errors of varying size.
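For reference, the Neumann series is the matrix analogue of the geometric series. The notation below is an assumption made for illustration and is not taken from the thesis: C denotes an exact, invertible matrix and E a perturbation of it, for instance the error introduced by estimating a covariance matrix from a finite ensemble. Under the stated norm condition, the inverse of the perturbed matrix can be expanded around the inverse of the exact one.

```latex
\[
  (I - A)^{-1} = \sum_{k=0}^{\infty} A^{k}, \qquad \|A\| < 1,
\]
\[
  (C + E)^{-1}
  = \left(I + C^{-1}E\right)^{-1} C^{-1}
  = \sum_{k=0}^{\infty} \left(-C^{-1}E\right)^{k} C^{-1}
  = C^{-1} - C^{-1} E\, C^{-1} + C^{-1} E\, C^{-1} E\, C^{-1} - \cdots,
  \qquad \|C^{-1}E\| < 1.
\]
```

Truncating the series after the first few terms gives an approximation whose accuracy depends on how small the perturbation is relative to the exact matrix.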