Suppose we have observed a dataset composed of events with two attributes, $x$ and $y$, as in this file: data.xlsx.

1. Plot this data in Microsoft Excel.
2. Form a hypothesis about the relationship between $x$ and $y$.
3. Use Excel’s Trendline toolbox to fit your hypothesized model to this data.
4. Is it a good fit to the data?
5. Try at least one other hypothesis for this dataset and fit the corresponding model to the observed trend in the data.
6. Which hypothesis is a better fit to your data: the original or the alternative?
7. Use the Excel Trendline again to obtain the equation of the model that seems to be the better fit to the data.
8. Using this equation, compute the model-predicted $y$ values for the corresponding $x$ values in the dataset.
9. Subtract the model-predicted $y$ values from the actual $y$ values in the dataset. We call these differences the fitting residuals.
10. Make a histogram of these fitting residuals in Excel. Does the histogram of residuals look significantly asymmetric?
(Hint: If you have chosen a good model for your data, then this histogram should look fairly symmetric.)
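
As a cross-check outside Excel, steps 7 through 10 can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the contents of data.xlsx are not reproduced here, so the `x` and `y` arrays below are synthetic stand-in values, the two hypotheses (a straight line and a quadratic) are example choices, and the helper name `fit_polynomial` is hypothetical. Least-squares polynomial fitting is used as a stand-in for the Trendline fit.

```python
import numpy as np

# Hypothetical stand-in data (NOT the contents of data.xlsx):
# a quadratic trend with added Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * x**2 + rng.normal(0.0, 1.0, size=x.size)


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit of the given degree.

    Returns the coefficients (highest power first), the
    model-predicted y values, and the fitting residuals.
    """
    coeffs = np.polyfit(x, y, degree)
    predicted = np.polyval(coeffs, x)
    residuals = y - predicted  # actual minus predicted (step 9)
    return coeffs, predicted, residuals


# Two competing hypotheses: a straight line and a quadratic.
coeffs_linear, pred_linear, res_linear = fit_polynomial(x, y, 1)
coeffs_quad, pred_quad, res_quad = fit_polynomial(x, y, 2)

# Sum of squared residuals for each model.
sse_linear = float(np.sum(res_linear**2))
sse_quad = float(np.sum(res_quad**2))
print("SSE linear:   ", sse_linear)
print("SSE quadratic:", sse_quad)

# Histogram of the residuals of the quadratic model (step 10).
counts, bin_edges = np.histogram(res_quad, bins=10)
print("bin counts:", counts)
```

Because a quadratic contains the straight line as a special case, its sum of squared residuals can never be larger than that of the line, so that number alone tends to favor the more flexible model. The shape of the residual histogram is the more informative check: a roughly symmetric histogram centered near zero is what the hint above describes.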