UncertainData.jl: a Julia package for working with measurements and datasets with uncertainties
Journal article, Peer reviewed
Published version
Permanent link
https://hdl.handle.net/11250/2728330
Date of publication
2019-11-01
Collections
- Department of Earth Science [1117]
- Registrations from Cristin [10773]
Abstract
UncertainData.jl provides an interface to represent data with associated uncertainties for
the Julia programming language (Bezanson, Edelman, Karpinski, & Shah, 2017). Unlike
Measurements.jl (Giordano, 2016), which deals with exact error propagation of normally
distributed values, UncertainData.jl uses a resampling approach to deal with uncertainties
in calculations. This allows working with and combining any type of uncertain value for which
a resampling method can be defined. Examples of currently supported uncertain values are:
theoretical distributions, e.g., those supported by Distributions.jl (Besançon et al., 2019; Lin
et al., 2019); values whose states are represented by a finite set of values with weighted
probabilities; values represented by empirical distributions; and more.
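The resampling idea described above can be illustrated with a minimal, language-agnostic sketch. It is written in Python using only the standard library and is not the UncertainData.jl API; the function names (`sample_normal`, `sample_weighted`, `resample_sum`) and all numeric values are assumptions made purely for illustration. The sketch adds a normally distributed value to a value represented by a finite set of weighted states, simply by drawing from each and combining the draws.

```python
import random
import statistics


def sample_normal(mu, sigma, rng):
    """Draw once from a normally distributed uncertain value."""
    return rng.gauss(mu, sigma)


def sample_weighted(states, weights, rng):
    """Draw once from a finite set of states with weighted probabilities."""
    return rng.choices(states, weights=weights, k=1)[0]


def resample_sum(n, seed=0):
    """Propagate uncertainty through a + b by resampling n times.

    No closed-form error propagation is needed: any value that can be
    sampled from can take part in the calculation.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        a = sample_normal(2.0, 0.3, rng)  # hypothetical measurement
        b = sample_weighted([1.0, 2.0, 3.0], [0.2, 0.5, 0.3], rng)
        draws.append(a + b)
    return draws


draws = resample_sum(10_000)
mean_sum = statistics.fmean(draws)  # summary of the resulting distribution
spread = statistics.stdev(draws)
```

The result of the calculation is itself a collection of draws, so it can be summarised or fed into further computations in the same way.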
The package simplifies resampling from uncertain datasets whose data points potentially have
different kinds of uncertainties, both in data values and potential index values (e.g., time or
space). The user may resample using a set of predefined constraints, truncating the supports
of the distributions furnishing the uncertain datasets, combined with interpolation on predefined grids. Methods for sequential resampling of ordered datasets that have indices with
uncertainties are also provided.
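As a rough illustration of constrained and sequential resampling, the sketch below draws one realisation of a set of uncertain indices (for example, ages) such that each draw stays within a quantile range of its distribution and the sequence is strictly increasing. It is a conceptual Python example, not the package's implementation: the helper names `truncated_draw` and `sequential_resample`, the 0.1 to 0.9 quantile range, and the use of simple rejection sampling are all assumptions chosen for brevity.

```python
import random
from statistics import NormalDist


def truncated_draw(dist, lo, hi, rng, max_tries=10_000):
    """Draw from dist restricted to the open interval (lo, hi) by rejection."""
    for _ in range(max_tries):
        x = rng.gauss(dist.mean, dist.stdev)
        if lo < x < hi:
            return x
    raise RuntimeError("no admissible draw found within max_tries")


def sequential_resample(index_dists, q_lo=0.1, q_hi=0.9, seed=0):
    """Draw one strictly increasing realisation of uncertain indices.

    Each index is first truncated to its [q_lo, q_hi] quantile range; the
    lower bound is then raised to the previous draw so that order is kept.
    """
    rng = random.Random(seed)
    realisation = []
    previous = float("-inf")
    for dist in index_dists:
        lo = max(dist.inv_cdf(q_lo), previous)
        hi = dist.inv_cdf(q_hi)
        x = truncated_draw(dist, lo, hi, rng)
        realisation.append(x)
        previous = x
    return realisation


# Hypothetical ordered dataset: four indices with overlapping uncertainties.
ages = [NormalDist(mu, 0.4) for mu in (1.0, 2.0, 3.0, 4.0)]
draw = sequential_resample(ages)
```

Rejection sampling is used here only because it is easy to read; it becomes inefficient when the admissible interval is narrow.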
Using Julia’s multiple dispatch, UncertainData.jl extends most elementary mathematical
operations, hypothesis tests from HypothesisTests.jl, and various methods from the StatsBase.jl package for uncertain values and uncertain datasets. Additional statistical algorithms in other packages are trivially adapted to handle uncertain values and datasets from
UncertainData.jl by using multiple dispatch and the provided resampling framework.
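The way an existing statistical algorithm can be adapted through resampling may be sketched as follows. The example is in Python and is not the package's code: a higher-order function (`resample_statistic`, a hypothetical name) stands in for Julia's multiple dispatch, and the data values are invented for illustration. An ordinary Pearson correlation that knows nothing about uncertainty is applied to many realisations of two uncertain datasets, yielding a distribution of the statistic.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation for two equal-length sequences of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def resample_statistic(stat, x_points, y_points, n, seed=0):
    """Apply stat to n realisations of two uncertain datasets.

    Each point is a (mean, standard deviation) pair, assumed normal here.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        xs = [rng.gauss(mu, sigma) for mu, sigma in x_points]
        ys = [rng.gauss(mu, sigma) for mu, sigma in y_points]
        results.append(stat(xs, ys))
    return results


# Hypothetical uncertain datasets.
x_points = [(1.0, 0.2), (2.0, 0.2), (3.0, 0.2), (4.0, 0.2), (5.0, 0.2), (6.0, 0.2)]
y_points = [(1.1, 0.3), (1.9, 0.3), (3.2, 0.3), (3.8, 0.3), (5.1, 0.3), (6.0, 0.3)]

correlations = resample_statistic(pearson, x_points, y_points, n=1_000)
mean_r = statistics.fmean(correlations)
```

The spread of `correlations` indicates how sensitive the statistic is to the uncertainties in the data.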
UncertainData.jl was originally designed to form the backbone of the uncertainty handling
in the CausalityTools.jl package, with the aim of quantifying the sensitivity of statistical time
series causality detection algorithms. Recently, the package has also been used in paleoclimate
research (Vasskog et al., 2019).