ParaMonte Fortran 2.0.0
Parallel Monte Carlo and Machine Learning Library
pm_sampleECDF Module Reference

This module contains classes and procedures for computing the Empirical Cumulative Distribution Function (ECDF) of an observational sample and its associated properties.

Data Types

interface  setECDF
 Compute and return the Empirical Cumulative Distribution Function (ECDF) of a univariate (optionally weighted) sample of size size(ecdf).
 

Variables

character(*, SK), parameter MODULE_NAME = "@pm_sampleECDF"
 

Detailed Description

This module contains classes and procedures for computing the Empirical Cumulative Distribution Function (ECDF) of an observational sample and its associated properties.

An Empirical Cumulative Distribution Function (ECDF) is the distribution function associated with the empirical measure of a sample.
For an unweighted sample, this cumulative distribution function is a step function that jumps up by \(1 / N\) at each of the \(N\) data points (tied data points produce a correspondingly larger jump).
Its value at any specified value of the measured variable is the fraction of observations of the measured variable that are less than or equal to the specified value.
The empirical distribution function is an estimate of the cumulative distribution function that generated the points in the sample.
It converges with probability 1 to that underlying distribution, according to the Glivenko–Cantelli theorem.
A number of results exist to quantify the rate of convergence of the empirical distribution function to the underlying cumulative distribution function.
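The description above can be illustrated with a minimal sketch in Python. This is not the interface of setECDF; the function names `ecdf_steps` and `ecdf_at` are hypothetical, and normalizing the weights by their sum is an assumption made here for illustration only.

```python
from itertools import accumulate


def ecdf_steps(sample, weights=None):
    """Return (sorted_values, cumulative_fractions) for a univariate sample.

    Hypothetical helper, not part of ParaMonte. For tied values, the entry
    at the last tied position equals the ECDF at that value.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    if weights is None:
        weights = [1.0] * n  # unweighted case: each point contributes 1/N
    if len(weights) != n:
        raise ValueError("weights must have the same length as sample")
    # Sort the observations and carry the weights along.
    order = sorted(range(n), key=lambda i: sample[i])
    values = [sample[i] for i in order]
    sorted_weights = [weights[i] for i in order]
    total = sum(sorted_weights)
    # Assumption: weights are normalized by their sum.
    fractions = [c / total for c in accumulate(sorted_weights)]
    return values, fractions


def ecdf_at(sample, t, weights=None):
    """Evaluate the ECDF at t: the (weighted) fraction of observations <= t."""
    if weights is None:
        weights = [1.0] * len(sample)
    total = sum(weights)
    return sum(w for x, w in zip(sample, weights) if x <= t) / total
```

For example, `ecdf_at([3.0, 1.0, 2.0, 4.0], 2.5)` counts the two observations not exceeding 2.5 and divides by the sample size of four.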

Definition

Let \((X_1, \ldots, X_N)\) be independent, identically distributed real random variables with the common cumulative distribution function \(F(t)\).
Then the empirical distribution function is defined as,

\begin{equation} \widehat{F}_{N}(t) = \frac{\mbox{number of elements in the sample} \leq t}{N} = \frac{1}{N} \sum_{i = 1}^{N} \mathbf{1}_{X_{i} \leq t} ~, \end{equation}

where \({\mathbf{1}}_{{A}}\) is the indicator of event \(A\).
For a fixed \(t\), the indicator \(\mathbf{1}_{X_{i}\leq t}\) is a Bernoulli random variable with parameter \(p = F(t)\).
Hence, \(N{\widehat{F}}_{N}(t)\) is a binomial random variable with mean \(N\times F(t)\) and variance \(N\times F(t)(1 - F(t))\).
This implies that \({\widehat{F}}_{N}(t)\) is an unbiased estimator for \(F(t)\).
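The mean and variance stated above can be checked numerically. The following is a rough simulation sketch, not library code: it assumes samples drawn from the standard uniform distribution, for which \(F(t) = t\) on \([0, 1]\), and the function names and default parameter values are hypothetical choices for illustration.

```python
import random


def fraction_at_or_below(sample, t):
    """ECDF of an unweighted sample evaluated at t."""
    return sum(1 for x in sample if x <= t) / len(sample)


def simulate_ecdf_at(t=0.3, n=50, replicates=2000, seed=0):
    """Return the mean and sample variance of the ECDF at t over many samples.

    Assumption: each sample holds n independent Uniform(0, 1) draws,
    so the true CDF at t is simply t.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    values = []
    for _ in range(replicates):
        sample = [rng.random() for _ in range(n)]
        values.append(fraction_at_or_below(sample, t))
    mean = sum(values) / replicates
    variance = sum((v - mean) ** 2 for v in values) / (replicates - 1)
    return mean, variance
```

Dividing the binomial mean and variance by \(N\) and \(N^2\) respectively, the returned mean should lie close to \(F(t) = 0.3\) and the returned variance close to \(F(t)(1 - F(t)) / N = 0.0042\) for the default arguments.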

See also
pm_sampling
pm_sampleACT
pm_sampleCCF
pm_sampleCor
pm_sampleCov
pm_sampleConv
pm_sampleECDF
pm_sampleMean
pm_sampleNorm
pm_sampleQuan
pm_sampleScale
pm_sampleShift
pm_sampleWeight
pm_sampleAffinity
pm_sampleVar
Empirical distribution function
Test:
test_pm_sampleECDF


Final Remarks


If you believe this algorithm or its documentation can be improved, we appreciate your contribution and help to edit this page's documentation and source file on GitHub.
For details on the naming abbreviations, see this page.
For details on the naming conventions, see this page.
This software is distributed under the MIT license with additional terms outlined below.

  1. If you use any parts or concepts from this library to any extent, please acknowledge the usage by citing the relevant publications of the ParaMonte library.
  2. If you regenerate any parts/ideas from this library in a programming environment other than those currently supported by this ParaMonte library (i.e., other than C, C++, Fortran, MATLAB, Python, R), please also ask the end users to cite this original ParaMonte library.

This software is available to the public under a highly permissive license.
Help us justify its continued development and maintenance by acknowledging its benefit to society, distributing it, and contributing to it.

Author:
Fatemeh Bagheri, Thursday 12:45 AM, August 20, 2021, Dallas, TX

Variable Documentation

◆ MODULE_NAME

character(*, SK), parameter pm_sampleECDF::MODULE_NAME = "@pm_sampleECDF"

Definition at line 75 of file pm_sampleECDF.F90.