Problem

Consider this dataset, Drand.mat, which contains a set of random numbers. Write a script that computes the mean of this sample via the Least-Sum-of-Squares method. For this, you will need to use a function minimizer in the language of your choice. At the end, compare your result with the simple method of computing the mean, which is the sum of all the values divided by the number of points.
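
As a brief sketch of why the Least-Sum-of-Squares method recovers the mean (the symbols $x_i$ for the data points, $N$ for the number of points, and $S$ for the objective are introduced here for illustration only), consider the sum of squared distances as a function of the trial mean $\mu$ and set its derivative to zero:

$$
S(\mu) = \sum_{i=1}^{N} (x_i - \mu)^2 ,
\qquad
\frac{dS}{d\mu} = -2 \sum_{i=1}^{N} (x_i - \mu) = 0
\quad\Longrightarrow\quad
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i .
$$

Since $d^2S/d\mu^2 = 2N > 0$, this stationary point is the minimum, so a numerical minimizer applied to $S(\mu)$ should agree with the simple average up to the tolerance of the search.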

Here is a Gaussian distribution fit to the histogram of this dataset using the most likely parameters.

Python

Name your main script findBestFitParameters.py. Here is an example of the expected output of such a script:

findBestFitParameters.py
Optimization terminated successfully.
         Current function value: 100.868424
         Iterations: 21
         Function evaluations: 42

Least-Squares mean = -0.08197021484375
simple average formula = -0.08197189633971344
relative difference = 2.0513289127509662e-05

Start your parameter search via fmin() with the following value: $\mu = 10$.
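
For readers unfamiliar with the minimizer, here is a minimal sketch of how `fmin()` from `scipy.optimize` can be called with a starting value of 10. It uses a toy parabola rather than the dataset, and the names `objective` and `best` are hypothetical, chosen only for this illustration.

```python
from scipy.optimize import fmin

def objective(mu):
    # fmin passes the trial point as a 1-element array.
    # Toy objective (an assumption for illustration): a parabola with its minimum at mu = 3.
    return float((mu[0] - 3.0) ** 2)

# Start the search at mu = 10; disp=False suppresses the convergence report.
best = fmin(func=objective, x0=10, disp=False)

# fmin returns an array holding the best point found; take its only element.
print("minimizer found near", best[0])
```

`fmin()` implements the Nelder-Mead simplex search, so it needs only function values, not derivatives.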

Solution

Python

Here is an implementation of this script, findBestFitParameters.py:

#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.io import loadmat
from scipy.optimize import fmin
sns.set()

# load the MATLAB data file and extract the array stored under the key "Drand"
Drand = loadmat("Drand.mat")
Data  = Drand["Drand"]

# plot a histogram of the data
fig = plt.figure( figsize=(9, 8)
                , dpi=75
                , facecolor='w'
                , edgecolor='w'
                ) # create figure object
ax = fig.add_subplot(1,1,1) # get the axes instance
ax.hist(Data.flatten())
plt.show()

# find the value that minimizes the sum of absolute deviations from the data

def getSumAbsDist(meanValue):
    """Return the sum of absolute deviations of Data from meanValue."""
    sumAbsDist = np.sum( np.abs(Data-meanValue) )
    return sumAbsDist

bestMeanValue = fmin( func = getSumAbsDist
                    , x0 = 0
                    , xtol = 0.00001
                    , ftol = 0.00001
                    )
bestMeanValue = bestMeanValue[0]
simpleMean = np.sum(Data)/Data.size # sum of all values divided by the number of points
print( """
average via Sum of Least-Absolute Deviations = {}
simple average formula = {}
""".format( bestMeanValue
          , simpleMean
          )
     )

Running this script prints the following output:

Optimization terminated successfully.  
     Current function value: 81.133610  
     Iterations: 14  
     Function evaluations: 32  

average via Sum of Least-Absolute Deviations = 0.0077500000000000025  
simple average formula = -0.08197189633971344  

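Comparing this with the Least-Sum-of-Squared-Distances method, you will notice that the two methods yield similar but not identical results. The script above minimizes the sum of absolute deviations, whose minimizer is the sample median rather than the mean, so its result (about 0.0078) differs from the simple average (about -0.0820) by roughly 0.09. Minimizing the sum of squared distances, as requested in the problem, reproduces the simple average almost exactly: in the expected output shown in the problem statement, the two agree to a relative difference of about 2e-05.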
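
The comparison can be sketched as follows. This is a hedged illustration, not the author's script: it uses a synthetic normal sample as a stand-in for Drand.mat (an assumption, since the data file is not reproduced here), and the names `sample`, `sum_sq_dist`, `sum_abs_dist`, `ls_mean`, and `lad_center` are hypothetical. Both searches start at 10, as the problem statement suggests.

```python
import numpy as np
from scipy.optimize import fmin

# Synthetic stand-in for the Drand.mat data (assumption: standard normal, 101 points).
rng = np.random.default_rng(seed=0)
sample = rng.normal(loc=0.0, scale=1.0, size=101)

def sum_sq_dist(mu):
    # Least-Sum-of-Squares objective: its minimizer is the sample mean.
    return np.sum((sample - mu) ** 2)

def sum_abs_dist(mu):
    # Least-absolute-deviations objective: its minimizer is the sample median.
    return np.sum(np.abs(sample - mu))

# Run the same Nelder-Mead search on both objectives, starting from mu = 10.
ls_mean = fmin(sum_sq_dist, x0=10, xtol=1e-5, ftol=1e-5, disp=False)[0]
lad_center = fmin(sum_abs_dist, x0=10, xtol=1e-5, ftol=1e-5, disp=False)[0]

print("least-squares minimizer       =", ls_mean)
print("simple average                =", np.sum(sample) / sample.size)
print("least-abs-deviation minimizer =", lad_center)
print("sample median                 =", np.median(sample))
```

Under these assumptions, the least-squares minimizer should match the simple average and the least-absolute-deviations minimizer should match the median, each to within the search tolerance.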
