ParaMonte Fortran 2.0.0
Parallel Monte Carlo and Machine Learning Library
pm_distanceKolm Module Reference

This module contains classes and procedures for computing the Kolmogorov statistical distance.

Data Types

interface  getDisKolm
 Generate and return the Kolmogorov distance of a sample sample1 of size nsam1 from another sample sample2 of size nsam2, or from the CDF of the Uniform or a custom reference distribution.
 
interface  setDisKolm
 Return the Kolmogorov distance of a sample sample1 of size nsam1 from another sample sample2 of size nsam2, or from the CDF of the Uniform or a custom reference distribution.
 

Variables

character(*, SK), parameter MODULE_NAME = "@pm_distanceKolm"
 

Detailed Description


The Kolmogorov distance of a univariate observational sample from another univariate observational sample is the largest separation between the Empirical Distribution Functions of the two samples.
Formally, the empirical distribution function \(F_n\) for \(n\) independent and identically distributed (i.i.d.) ordered observations \(X_i\) is defined as,

\begin{equation} F_{n}(x) = \frac{\text{number of elements in the sample} \leq x}{n} = \frac{1}{n} \sum_{i=1}^{n} 1_{(-\infty ,x]}(X_{i}) ~, \end{equation}

where \(1_{(-\infty ,x]}(X_{i})\) is the indicator function, equal to \(1\) if \(X_{i}\leq x\) and equal to \(0\) otherwise.
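The definition of \(F_n\) above can be illustrated with a minimal Python sketch. It is independent of this library: the function name `ecdf` is hypothetical and is not part of the ParaMonte interface.

```python
from bisect import bisect_right

def ecdf(sample, x):
    """Return F_n(x): the fraction of observations in `sample` that are <= x."""
    ordered = sorted(sample)
    # bisect_right returns the number of sorted values that are <= x,
    # which is the sum of the indicator function over the sample.
    return bisect_right(ordered, x) / len(ordered)

sample = [0.9, 0.1, 0.4, 0.4]
print(ecdf(sample, 0.4))  # 3 of the 4 observations are <= 0.4, so this prints 0.75
print(ecdf(sample, 0.0))  # no observation is <= 0.0, so this prints 0.0
```

Sorting on every call keeps the sketch short. Code that evaluates \(F_n\) at many points would sort the sample once and reuse it.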
The Kolmogorov–Smirnov distance (or statistic) for a given cumulative distribution function \(F(x)\) is,

\begin{equation} D_{n} = \sup_{x}|F_{n}(x) - F(x)| ~, \end{equation}

where \(\sup_x\) is the supremum of the set of distances \(|F_{n}(x) - F(x)|\) over all \(x\).
Intuitively, the statistic takes the largest absolute difference between the two distribution functions across all \(x\) values.
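As a hedged illustration of \(D_n\), the sketch below computes the distance of a sample from a reference CDF in Python. It assumes the reference CDF is continuous, so that the supremum is reached at an observation, just before or just after the jump of \(F_n\). The names `kolmogorov_distance` and `uniform_cdf` are hypothetical. This is not the implementation behind getDisKolm or setDisKolm.

```python
def kolmogorov_distance(sample, cdf):
    """Return sup_x |F_n(x) - F(x)| for a continuous reference CDF `cdf`."""
    ordered = sorted(sample)
    n = len(ordered)
    dist = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # F_n jumps from (i - 1) / n to i / n at the i-th ordered observation,
        # so compare F with both the upper and the lower side of the jump.
        dist = max(dist, i / n - f, f - (i - 1) / n)
    return dist

def uniform_cdf(x):
    """CDF of the standard Uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))

# The largest gap occurs at x = 0.7, where F_n = 1 and F = 0.7 (distance about 0.3).
print(kolmogorov_distance([0.1, 0.4, 0.7], uniform_cdf))
```

Only the \(n\) sample points need to be checked because \(F_n\) is constant between consecutive observations while \(F\) is non-decreasing.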
By the Glivenko–Cantelli theorem, if the sample comes from distribution \(F(x)\), then \(D_n\) converges to \(0\) almost surely in the limit when \(n\) goes to infinity.
Kolmogorov strengthened this result by effectively providing the rate of this convergence through the definition of the Kolmogorov distribution.
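The two-sample case described at the start of this section, the largest separation between the empirical distribution functions of two samples, can be sketched in the same way. The Python function below is a minimal illustration under the assumption that both samples are non-empty. Its name is hypothetical and it does not reproduce the library's own algorithm.

```python
def kolmogorov_distance_two_sample(sample1, sample2):
    """Return the largest absolute difference between the ECDFs of two samples."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    i = j = 0
    dist = 0.0
    # Walk through the pooled ordered values; i and j count how many
    # observations of each sample are <= the current value x.
    while i < n1 and j < n2:
        x = min(s1[i], s2[j])
        # Advance past every observation equal to x in both samples (handles ties).
        while i < n1 and s1[i] == x:
            i += 1
        while j < n2 and s2[j] == x:
            j += 1
        dist = max(dist, abs(i / n1 - j / n2))
    return dist

print(kolmogorov_distance_two_sample([1, 2], [3, 4]))  # non-overlapping samples: prints 1.0
print(kolmogorov_distance_two_sample([1, 2, 3], [1, 2, 3]))  # identical samples: prints 0.0
```

The loop can stop as soon as one sample is exhausted: that sample's distribution function has reached 1, so the gap can only shrink as the other one catches up.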

See also
pm_distKolm
Test:
test_pm_distanceKolm


Final Remarks


If you believe this algorithm or its documentation can be improved, we welcome your contributions to this page's documentation and source file on GitHub.
For details on the naming abbreviations, see this page.
For details on the naming conventions, see this page.
This software is distributed under the MIT license with additional terms outlined below.

  1. If you use any parts or concepts from this library to any extent, please acknowledge the usage by citing the relevant publications of the ParaMonte library.
  2. If you regenerate any parts/ideas from this library in a programming environment other than those currently supported by this ParaMonte library (i.e., other than C, C++, Fortran, MATLAB, Python, R), please also ask the end users to cite this original ParaMonte library.

This software is available to the public under a highly permissive license.
Help us justify its continued development and maintenance by acknowledging its benefit to society, distributing it, and contributing to it.

Author:
Amir Shahmoradi, March 22, 2012, 2:21 PM, National Institute for Fusion Studies, The University of Texas at Austin

Variable Documentation

MODULE_NAME

character(*, SK), parameter pm_distanceKolm::MODULE_NAME = "@pm_distanceKolm"

Definition at line 61 of file pm_distanceKolm.F90.