CDSLab Recipes

15 November 2021

Puzzle: Matchstick Wrong Equation

Problem Move just one matchstick in the following equation to make it hold.

equation matchstick puzzle

15 November 2021

Puzzle: How many and what living creatures do you see?

Problem How many and what living creatures can you identify in this picture?

equation matchstick puzzle

11 November 2021

Regression: Predicting the global land temperature of Earth in 2050 from the past data: Choosing the best model

Problem Consider this dataset, 1880_2020.csv, which contains the monthly global land and ocean temperature anomalies of the Earth from January 1880 to June 2020. As stated in the file, temperatures are in degrees Celsius and reported as anomalies...

Excel PDF climate distribution function figure global least squares method line objective function plot probability probability density function random number regression visualization warming

05 November 2021

Regression: Estimating the parameters of a linear model for a Normally-distributed sample

Problem Suppose we have observed a dataset comprising events with one attribute, as in this file: z.csv. Plotting these points would yield a histogram like the following plot. Now our goal is to form a hypothesis about this dataset,...

Gaussian MATLAB MCMC Markov Chain Monte Carlo Normal distribution PDF ParaDRAM ParaMonte Python distribution distribution function figure line linear maximum likelihood method objective function plot probability probability density function random number regression uncertainty quantification visualization

28 October 2021

Regression: Estimating the parameters of a Normally-distributed sample

Problem Suppose we have observed a dataset comprising $15027$ events with one attribute variable in this file: dataFull.csv. Plotting these points would yield a histogram like the following plot. Now our goal is to form a hypothesis about this...

Gaussian MATLAB MCMC Markov Chain Monte Carlo Normal distribution PDF ParaDRAM ParaMonte Python distribution distribution function figure line linear maximum likelihood method objective function plot probability probability density function random number regression uncertainty quantification visualization
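
As a rough illustration of the maximum likelihood idea named in the tags above, here is a minimal sketch that fits the mean and standard deviation of a Normal model using the closed-form estimators. It is not the recipe's own solution: it does not read dataFull.csv, it uses a synthetic sample whose true parameters (2.0 and 3.0) are made up for illustration, and it uses the closed-form result rather than an MCMC sampler such as ParaDRAM. The function names are hypothetical.

```python
import numpy as np

def normal_log_likelihood(data, mu, sigma):
    """Log-likelihood of i.i.d. Normal(mu, sigma) data."""
    data = np.asarray(data, dtype=float)
    n = data.size
    return (-0.5 * n * np.log(2.0 * np.pi * sigma**2)
            - np.sum((data - mu) ** 2) / (2.0 * sigma**2))

def normal_mle(data):
    """Closed-form maximum likelihood estimates of mu and sigma."""
    data = np.asarray(data, dtype=float)
    mu_hat = data.mean()
    # The MLE of sigma divides by n (not n - 1).
    sigma_hat = np.sqrt(np.mean((data - mu_hat) ** 2))
    return mu_hat, sigma_hat

# Synthetic stand-in for the real dataset (assumed parameters, fixed seed).
rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=3.0, size=15027)

mu_hat, sigma_hat = normal_mle(sample)
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```

Evaluating `normal_log_likelihood` at `(mu_hat, sigma_hat)` and at nearby values is a quick way to confirm that the estimates sit at the maximum; a sampler would additionally describe the uncertainty around them.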

20 October 2021

Computing the cross-correlation of sin() and cos()

Problem Generate two arrays corresponding to the values of $\sin(x)$ and $\cos(x+\pi/2)$ functions in the range $[0, 10\pi]$. Make a plot of the resulting arrays like the following illustration. Now use an FFT package in the language of your choice...

Python correlation cos covariance crosscorrelation periodic sample sin statistics
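
One possible way to approach the FFT step is sketched below with NumPy's `numpy.fft` module. The helper name `cross_correlation_fft`, the number of grid points, and the choice to zero-pad (which gives a linear rather than circular cross-correlation) are assumptions of this sketch, not part of the recipe.

```python
import numpy as np

def cross_correlation_fft(a, b):
    """Linear cross-correlation of two equal-length 1-D arrays via FFT.

    Returns (lags, ccf) where ccf[i] = sum_n a[n + lags[i]] * b[n].
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    nfft = 2 * n - 1  # zero-pad so the circular result equals the linear one
    fa = np.fft.rfft(a, nfft)
    fb = np.fft.rfft(b, nfft)
    cc = np.fft.irfft(fa * np.conj(fb), nfft)
    # cc[:n] holds lags 0..n-1; cc[n:] holds lags -(n-1)..-1.
    ccf = np.concatenate((cc[n:], cc[:n]))
    lags = np.arange(-(n - 1), n)
    return lags, ccf

x = np.linspace(0.0, 10.0 * np.pi, 1000)
sin_x = np.sin(x)
cos_shifted = np.cos(x + np.pi / 2.0)  # identical to -sin(x)

lags, ccf = cross_correlation_fft(sin_x, cos_shifted)
print("lag of strongest anti-correlation:", lags[np.argmin(ccf)])
```

Because $\cos(x+\pi/2) = -\sin(x)$, the two arrays are exact negatives of each other, so the cross-correlation is the negated autocorrelation of $\sin(x)$. The result can be cross-checked against `np.correlate(sin_x, cos_shifted, mode="full")`.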

11 October 2021

Computing the cross-correlation of two data attributes

Problem Consider this dataset of carbon emissions history per country. Make a visualization of the global carbon emission data in the CSV file above by summing the contributions of all countries per year to obtain an illustration...

CO2 Python carbon correlation covariance crosscorrelation sample statistics warming

11 October 2021

Computing the autocorrelation of a dataset

Problem Recall the globalLandTempHist.txt dataset that consisted of the global land temperature of Earth over the past 300 years. Also recall that the autocorrelation of a time-series is defined as the correlation of a univariate dataset with itself, with some...

Python autocorrelation correlation covariance sample statistics warming
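
The definition recalled above (the correlation of a series with a lagged copy of itself) can be sketched directly in NumPy. This is only an illustration under stated assumptions: the function name `autocorrelation` is hypothetical, the normalization (divide by the lag-0 sum of squares of the mean-subtracted series) is one common convention among several, and the input is a synthetic sine series rather than globalLandTempHist.txt.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag.

    The series is mean-subtracted and the result is normalized so that
    the lag-0 value equals 1.
    """
    x = np.asarray(x, dtype=float)
    if not 0 <= max_lag < len(x):
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(x)")
    x = x - x.mean()
    norm = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / norm
                     for k in range(max_lag + 1)])

# Synthetic periodic series standing in for a real time series.
series = np.sin(np.linspace(0.0, 10.0 * np.pi, 500))
acf = autocorrelation(series, max_lag=100)
print(acf[:5])
```

For long series the same quantity is usually computed faster with an FFT, but the direct loop keeps the definition visible.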

11 October 2021

Computing and removing the autocorrelation of a dataset

Problem Consider the following Banana function.

    def getLogFuncBanana(point):
        import numpy as np
        from scipy.stats import multivariate_normal as mvn
        from scipy.special import logsumexp
        NPAR = 2
        # sum(Banana,gaussian) normalization factor
        normfac = 0.3
        # sum(Banana,gaussian) normalization factor
        lognormfac = np.log(normfac)
        #...

MCMC Monte Carlo ParaMonte Python autocorrelation correlation covariance sample statistics warming

08 October 2021

Ugly visualization

Problem What is ugly in the following graph?

histogram plot ugly visualization

08 October 2021

The population growths of the US states

Problem Which color scale has been used in the following visualization?

colorscale plot visualization

08 October 2021

The cities with the most and least moderate temperature

Problem Consider the following plot displaying the temperatures of a number of US cities. Which city’s temperature varies the least throughout the year? Which city’s temperature varies the most throughout the year? Which city is the hottest in the...

coordinates periodic plot polar visualization

08 October 2021

Wrong visualization

Problem What is wrong in the following visualization?

density histogram kernel plot visualization wrong

08 October 2021

Excel Bar plot

Problem Consider the following salary data.

Data Scientist | Physicist | Bioinformatician
---------------|-----------|-----------------
$110,000       | $122,000  | $58,000

Make a graph of this data in Microsoft Excel similar to the following visualization.

density histogram kernel plot visualization wrong

08 October 2021

Visualization color scales

Problem Which classes of color scales do the following color mappings belong to? a) b) c) d)

colorscale plot visualization

05 October 2021

Regression: Model selection for a bivariate data using Excel

Problem Suppose we have observed a dataset comprising events with two attributes $x$ and $y$, as in this file: data.xlsx. Plot this data in Microsoft Excel. Form a hypothesis about the relationship between $x$ and $y$. Use Excel’s Trendline...

Gaussian Normal distribution distribution exponential figure line linear logarithmic moving average plot polynomial random number regression visualization

05 October 2021

Cognitive Biases

Problem Suppose I have discovered a positive relationship between properties of some celestial objects, like the one formed by the black dots in the following figure. But in making such a discovery, I repeatedly and subconsciously throw away any data...

bias cognitive

01 October 2021

Visualizing and comparing the temperatures of Honolulu and Duluth

Problem Consider the following CSV dataset containing the temperatures of cities around the world from 1995 to 2020. Each row in the file corresponds to the average temperature (in Fahrenheit) of a city on a given day of the year....

CSV Duluth Hawaii Honolulu IO MATLAB Minnesota Python figure line mean pandas periodic plot read_csv variance visualization warming

01 October 2021

Visualizing and comparing the temperatures of Honolulu and Duluth via Excel

Problem Consider the following Excel dataset containing the temperatures of two US cities, Honolulu, HI and Duluth, MN, from 1995 to 2020. There are two pages in the Excel file: Duluth and Honolulu. Each row in the file corresponds to...

CSV Duluth Excel Hawaii Honolulu IO MATLAB Minnesota Python figure line mean pandas periodic plot read_csv variance visualization warming

01 October 2021

Visualizing the average precipitation of the US states vs. sunshine

Problem Consider the following dataset containing the average annual precipitation in the US states from 1971 to 2000 and this dataset. Combine these two datasets in Excel and generate a plot of US state precipitation vs. sunshine like the following figure. Note...

Excel data figure input output usa visualization