ParaMonte Fortran 2.0.0
Parallel Monte Carlo and Machine Learning Library
pm_sampleMean Module Reference

This module contains classes and procedures for computing the first moment (i.e., the statistical mean) of random weighted samples.

Data Types

interface  getMean
 Generate and return the (weighted) mean of an input sample of nsam observations with ndim = 1 or 2 attributes, optionally weighted by the input weight.
 
interface  getMeanMerged
 Generate and return the (weighted) merged mean of a sample resulting from the merger of two separate (weighted) samples \(A\) and \(B\).
 
interface  setMean
 Return the (weighted) mean of a pair of time series or of an input sample of nsam observations with ndim = 1 or 2 attributes, optionally weighted by the input weight, and optionally also return sum(weight) and sum(weight**2).
 
interface  setMeanMerged
 Return the (weighted) merged mean of a sample resulting from the merger of two separate (weighted) samples \(A\) and \(B\).
 

Variables

character(*, SK), parameter MODULE_NAME = "@pm_sampleMean"
 

Detailed Description

This module contains classes and procedures for computing the first moment (i.e., the statistical mean) of random weighted samples.

The mean of a weighted sample of \(N\) data points is computed by the following equation,

\begin{equation} \mu = \frac{\sum_{i = 1}^{N} w_i x_i}{\sum_{i = 1}^{N} w_i} \end{equation}

where \(w_i\) represents the weight of the \(i\)th data point \(x_i\) in the sample.
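The weighted-mean definition can be illustrated with a minimal, language-neutral sketch in Python. The helper name `weighted_mean` and its arguments are hypothetical and chosen for illustration only; they are not part of the ParaMonte interface. Unit weights are assumed when no weight is supplied.

```python
# Illustrative sketch only: `weighted_mean` is a hypothetical helper,
# not a ParaMonte procedure.
def weighted_mean(sample, weight=None):
    """Return sum(w_i * x_i) / sum(w_i); unit weights are assumed if weight is None."""
    if weight is None:
        weight = [1.0] * len(sample)
    if len(weight) != len(sample):
        raise ValueError("sample and weight must have the same length")
    weight_sum = sum(weight)
    if weight_sum <= 0:
        raise ValueError("the sum of the weights must be positive")
    return sum(w * x for w, x in zip(weight, sample)) / weight_sum


sample = [1.0, 2.0, 4.0]
weight = [1.0, 1.0, 2.0]
mu = weighted_mean(sample, weight)  # (1*1 + 1*2 + 2*4) / (1 + 1 + 2) = 2.75
```

With all weights equal to one, the result reduces to the ordinary arithmetic mean.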

Mean updating

Suppose the mean of an initial (potentially weighted) sample \(x_A\) of size \(N_A\) is computed to be \(\mu_A\).
Another (potentially weighted) sample \(x_B\) of size \(N_B\) is subsequently obtained with a different number of observations and mean \(\mu_B\).
The mean of the two samples combined can be expressed in terms of the originally computed means,

\begin{equation} \large \mu = \frac { w_A \mu_A + w_B \mu_B }{ w_A + w_B } = \frac { \sum_{i = 1}^{N_A} w_{\up{A,i}} x_{\up{A,i}} + \sum_{i = 1}^{N_B} w_{\up{B,i}} x_{\up{B,i}} }{ w_A + w_B } \end{equation}

where \(\large w_A = \sum_{i = 1}^{N_A} w_{\up{A,i}}\) and \(\large w_B = \sum_{i = 1}^{N_B} w_{\up{B,i}}\) are sums of the weights of the corresponding samples.
For equally weighted samples, the corresponding weights \(w_{\up{A,i}}\) or \(w_{\up{B,i}}\) or both are all unity, such that \(w_A = N_A\) or \(w_B = N_B\) or both hold.
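The merging rule can be checked numerically. The sketch below combines two precomputed means, each weighted by the sum of the weights of its sample, as \((w_A \mu_A + w_B \mu_B) / (w_A + w_B)\), and compares the result with the mean computed directly from the concatenated sample. The helper names `mean_and_weight_sum` and `merge_means` are hypothetical and are not the signatures of getMeanMerged or setMeanMerged.

```python
# Illustrative sketch only: the helpers below are hypothetical,
# not ParaMonte procedures.
def mean_and_weight_sum(sample, weight=None):
    """Return the weighted mean of a sample together with the sum of its weights."""
    if weight is None:
        weight = [1.0] * len(sample)
    weight_sum = float(sum(weight))
    mean = sum(w * x for w, x in zip(weight, sample)) / weight_sum
    return mean, weight_sum


def merge_means(mean_a, weight_sum_a, mean_b, weight_sum_b):
    """Merge two means using only the means and the weight sums of the two samples."""
    weight_sum = weight_sum_a + weight_sum_b
    merged = (weight_sum_a * mean_a + weight_sum_b * mean_b) / weight_sum
    return merged, weight_sum


x_a, w_a = [1.0, 3.0], [1.0, 3.0]              # mu_A = 10 / 4 = 2.5, w_A = 4
x_b, w_b = [2.0, 6.0, 10.0], [2.0, 1.0, 1.0]   # mu_B = 20 / 4 = 5.0, w_B = 4

mu_a, wsum_a = mean_and_weight_sum(x_a, w_a)
mu_b, wsum_b = mean_and_weight_sum(x_b, w_b)

# Merge without revisiting the individual observations.
mu_merged, wsum = merge_means(mu_a, wsum_a, mu_b, wsum_b)

# Reference value computed from the concatenated sample.
mu_direct, _ = mean_and_weight_sum(x_a + x_b, w_a + w_b)
```

The merge needs only the two means and the two weight sums, so the original observations do not have to be kept once each sample has been summarized.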

Developer Remark:
While it is tempting to extend the generic interfaces of this module to weight arguments of type integer or real of various kinds, such extensions add no benefit beyond making the interface more flexible for the end user.
They would, however, make the maintenance and future extension of this interface more difficult and complex.
According to the coercion rules of the Fortran standard, when an integer is multiplied by a real, the integer value is first converted to a real of the same kind as the real operand and then multiplied.
Furthermore, floating-point multiplication tends to be faster than integer multiplication on most modern architectures.
The following list compares the costs and latencies of some basic operations involving integers and real numbers; the figures are indicative and vary across processors.
  1. Central Processing Unit (CPU):
    1. Integer add: 1 cycle
    2. 32-bit integer multiply: 10 cycles
    3. 64-bit integer multiply: 20 cycles
    4. 32-bit integer divide: 69 cycles
    5. 64-bit integer divide: 133 cycles
  2. On-chip Floating Point Unit (FPU):
    1. Floating point add: 4 cycles
    2. Floating point multiply: 7 cycles
    3. Double precision multiply: 8 cycles
    4. Floating point divide: 23 cycles
    5. Double precision divide: 36 cycles
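In practice, a caller holding integer weights (for example, repetition counts) can convert them to real values once before computing the mean. The sketch below is a language-neutral illustration in Python with hypothetical variable names. It assumes the integer weights are counts, in which case the weighted mean matches the plain mean of the sample with each observation repeated by its count.

```python
# Illustrative sketch only: variable names are hypothetical and the
# code does not call any ParaMonte procedure.
sample = [0.5, 1.5, 2.5]
counts = [2, 1, 3]  # integer weights, interpreted here as repetition counts

# Convert the integer weights to floating point once, on the caller's side.
weight = [float(c) for c in counts]
mu_weighted = sum(w * x for w, x in zip(weight, sample)) / sum(weight)

# Reference: expand each observation by its count and take the plain mean.
expanded = [x for c, x in zip(counts, sample) for _ in range(c)]
mu_expanded = sum(expanded) / len(expanded)
```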
See also
pm_sampling
pm_sampleACT
pm_sampleCCF
pm_sampleCor
pm_sampleCov
pm_sampleConv
pm_sampleECDF
pm_sampleMean
pm_sampleNorm
pm_sampleQuan
pm_sampleScale
pm_sampleShift
pm_sampleWeight
pm_sampleAffinity
pm_sampleVar
Intel Fortran Forum - Integer VS fp performance
Colorado State University tips on Fortran performance
Test:
test_pm_sampleMean


Final Remarks


If you believe this algorithm or its documentation can be improved, we appreciate your contribution and help to edit this page's documentation and source file on GitHub.
For details on the naming abbreviations, see this page.
For details on the naming conventions, see this page.
This software is distributed under the MIT license with additional terms outlined below.

  1. If you use any parts or concepts from this library to any extent, please acknowledge the usage by citing the relevant publications of the ParaMonte library.
  2. If you regenerate any parts/ideas from this library in a programming environment other than those currently supported by this ParaMonte library (i.e., other than C, C++, Fortran, MATLAB, Python, R), please also ask the end users to cite this original ParaMonte library.

This software is available to the public under a highly permissive license.
Help us justify its continued development and maintenance by acknowledging its benefit to society, distributing it, and contributing to it.

Author:
Fatemeh Bagheri, Thursday 12:45 AM, August 20, 2021, Dallas, TX

Variable Documentation

◆ MODULE_NAME

character(*, SK), parameter pm_sampleMean::MODULE_NAME = "@pm_sampleMean"

Definition at line 110 of file pm_sampleMean.F90.