Problem

Recall the globalLandTempHist.txt dataset, which contains the global land temperature of Earth over the past 300 years. Also recall that the autocorrelation of a time series is defined as the correlation of a univariate dataset with a copy of itself shifted by some positive lag $\tau$.

Use the definition of the correlation matrix that we have seen before to compute the autocorrelation of Earth's temperature anomaly, starting from the first non-NaN value and running to the end of the data, for all possible lags. Make a plot of the autocorrelation vs. lag.
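One possible way to set up the slow, lag-by-lag computation is sketched below. It is only an illustration, not the reference solution: the function name acf_by_definition is hypothetical, the anomalies array is synthetic stand-in data rather than the contents of globalLandTempHist.txt, and the sketch assumes there are no NaN values after the first valid entry. It also stops at the lag that leaves three overlapping points, since the correlation coefficient is not informative for shorter overlaps.

```python
import numpy as np

def acf_by_definition(x):
    """Autocorrelation at lags 0..n-3 from the 2x2 correlation matrix."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Stop at lag n-3 so that every overlap keeps at least 3 points
    lags = np.arange(n - 2)
    acf = np.empty(len(lags))
    for tau in lags:
        # Correlation matrix of the series with its copy shifted by tau;
        # the off-diagonal entry is the autocorrelation at this lag
        corr = np.corrcoef(x[: n - tau], x[tau:])
        acf[tau] = corr[0, 1]
    return lags, acf

# Synthetic stand-in for the anomaly column (assumption: leading NaNs only)
rng = np.random.default_rng(0)
anomalies = np.concatenate(([np.nan, np.nan], np.cumsum(rng.normal(size=200))))

# Drop everything before the first non-NaN value
first_valid = np.flatnonzero(~np.isnan(anomalies))[0]
anomalies = anomalies[first_valid:]

lags, acf = acf_by_definition(anomalies)
```

The returned lags and acf arrays can then be passed to any plotting routine to produce the autocorrelation vs. lag plot.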

Now, use an external library in the language of your choice to compute the autocorrelation using the Fast Fourier Transform (FFT). In Python, you can use the correlate function from the SciPy package (from scipy.signal import correlate). To do so, you first have to center the input data (the anomaly data) by subtracting its mean. Then you pass the data with syntax like the following,

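import numpy as np
from scipy.signal import correlate

# Center the data so the correlation measures fluctuations about the mean
anomalies = anomalies - np.mean(anomalies)

nlag = len(anomalies) - 1  # index of the zero-lag term in the "full" output

# method="fft" forces the FFT path; the default "auto" may pick direct summation
acf = correlate ( anomalies
                , anomalies
                , mode = "full"
                , method = "fft"
                )[nlag:2*nlag]  # keep lags 0 .. nlag-1
acf = acf / acf[0]  # normalize so the lag-0 autocorrelation is 1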

Make a plot of this autocorrelation function (acf) and compare it with what you obtained from the slow version you implemented. Here is an illustration of the average anomaly data per year and its autocorrelation function:
[Figures: globalTempAnomalies.png (average anomaly per year), globalTempAnomaliesACF.png (its autocorrelation function)]
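When comparing the two results, it may help to check the FFT output against an explicit sum over lagged products first. The sketch below does this on synthetic stand-in data (the array x is an assumption, not the real anomaly series). Note that this estimator subtracts one global mean and divides by the lag-0 sum, whereas a lag-by-lag correlation coefficient re-centers and re-scales each overlapping segment, so the two curves are expected to agree closely at small lags and drift apart at large lags.

```python
import numpy as np
from scipy.signal import correlate

# Synthetic stand-in for the anomaly series (assumption, not the real data)
rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=300))
x = x - x.mean()  # center on the global mean
n = len(x)

# FFT-based estimate: the zero-lag term sits at index n-1 of the "full" output
acf_fft = correlate(x, x, mode="full", method="fft")[n - 1:]
acf_fft = acf_fft / acf_fft[0]

# Slow reference: explicit sum of lagged products for every lag 0..n-1
acf_slow = np.array([np.dot(x[: n - tau], x[tau:]) for tau in range(n)])
acf_slow = acf_slow / acf_slow[0]

# Largest absolute disagreement between the two estimates
max_diff = np.max(np.abs(acf_fft - acf_slow))
```

Here max_diff should be at the level of floating-point round-off, which confirms that the FFT route computes the same lagged sums, only faster.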
