Problem

Consider this dataset, Drand.mat, which contains a set of random numbers. Write a script that computes the mean of this sample via the Least-Sum-of-Squares method, that is, by finding the value whose sum of squared distances from all data points is the smallest. For this, you will need to use a function minimizer in the language of your choice. In the end, compare your result with the simple method of computing the mean, which is the sum of all the values divided by the number of points.
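
In symbols, the method can be sketched as follows. The notation $x_i$ for the $N$ data values, $S(\mu)$ for the sum of squares, and $\bar{x}$ for the simple average is introduced here for illustration and is not part of the original problem statement. The Least-Sum-of-Squares estimate is the value of $\mu$ that minimizes

$$
S(\mu) = \sum_{i=1}^{N} (x_i - \mu)^2 ,
$$

while the simple average is

$$
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i .
$$

Setting the derivative of $S$ to zero gives

$$
\frac{dS}{d\mu} = -2 \sum_{i=1}^{N} (x_i - \mu) = 0
\quad \Longrightarrow \quad
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i = \bar{x} ,
$$

so the two methods should agree up to the tolerance of the numerical minimizer.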

Here is a Gaussian distribution, with its most likely parameters, fit to the histogram of this dataset.

Python

Name your main script findBestFitParameters.py. Here is an example of the expected output of such a script,

findBestFitParameters.py
Optimization terminated successfully.
         Current function value: 100.868424
         Iterations: 21
         Function evaluations: 42

Least-Squares mean = -0.08197021484375
simple average formula = -0.08197189633971344
relative difference = 2.0513289127509662e-05

Start your parameter search via fmin() with the following value: $\mu = 10$.

Solution

Python

Here is an implementation of this script, findBestFitParameters.py,

#!/usr/bin/env python
import numpy as np
from scipy.io import loadmat
from scipy.optimize import fmin


# load MATLAB data file
Drand = loadmat("Drand.mat")
Data  = Drand["Drand"]


import matplotlib.pyplot as plt

# plot a histogram of the data
fig = plt.figure(figsize=(9, 8), dpi=300, facecolor='w', edgecolor='w')  # create figure object
ax = fig.add_subplot(1, 1, 1)  # get the axes instance
ax.hist(Data.ravel())  # flatten the 2-D array returned by loadmat
plt.show()

# find the mean by minimizing the sum of squared distances from the data

def getSumDistSq(meanValue):
    # sum of squared distances of all data points from the trial mean
    sumDistSq = np.sum( (Data-meanValue)**2 )
    return sumDistSq

bestMeanValue = fmin( func = getSumDistSq
                    , x0 = 10
                    )
bestMeanValue = bestMeanValue[0]
# loadmat returns a 2-D array, so count points with .size rather than len()
simpleMean = np.sum(Data)/Data.size
print( """
Least-Squares mean = {}
simple average formula = {}
relative difference = {}
""".format( bestMeanValue, simpleMean, 2*abs((bestMeanValue-simpleMean)/(bestMeanValue+simpleMean)) ) )

Running the script produces the following output,

Optimization terminated successfully.
         Current function value: 100.868424
         Iterations: 21
         Function evaluations: 42

Least-Squares mean = -0.08197021484375
simple average formula = -0.08197189633971344
relative difference = 2.0513289127509662e-05
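
The small relative difference above most likely reflects the termination tolerances of fmin() rather than a real disagreement between the two methods. The following is a minimal sketch of how one might check this. It assumes synthetic normally distributed data as a stand-in for Drand.mat, and the names get_sum_dist_sq and least_squares_mean are hypothetical helpers introduced only for this illustration.

```python
import numpy as np
from scipy.optimize import fmin

# Synthetic stand-in for Drand.mat (an assumption; the real file is not used here).
rng = np.random.default_rng(seed=12345)
data = rng.standard_normal(100)


def get_sum_dist_sq(mean_value, sample):
    """Sum of squared distances of the sample points from a trial mean."""
    return np.sum((sample - mean_value) ** 2)


def least_squares_mean(sample, start=10.0, tol=1e-4):
    """Minimize the sum of squared distances with fmin (Nelder-Mead simplex)."""
    best = fmin(get_sum_dist_sq, x0=start, args=(sample,),
                xtol=tol, ftol=tol, disp=False)
    return float(best[0])


simple_mean = np.sum(data) / data.size
loose = least_squares_mean(data)            # fmin's default tolerances (1e-4)
tight = least_squares_mean(data, tol=1e-8)  # much tighter tolerances

print("simple average     =", simple_mean)
print("default tolerances =", loose, " abs. error =", abs(loose - simple_mean))
print("tight tolerances   =", tight, " abs. error =", abs(tight - simple_mean))
```

Passing the data through args, instead of reading a global variable, keeps the objective function reusable for other samples. With tighter xtol and ftol the minimizer is expected to land closer to the simple average, at the cost of more function evaluations.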
