Suppose we have observed a dataset consisting of events with two attributes, $x$ and $y$, as in this file: data.xlsx.
- Plot this data in Microsoft Excel.
- Form a hypothesis about the relationship between $x$ and $y$.
- Use Excel’s Trendline toolbox to fit your hypothesized model to this data.
- Is it a good fit to the data?
- Try at least one other hypothesis for this dataset and fit the corresponding model to the observed trend in the data.
- Which hypothesis is a better fit to your data? The original or your alternative hypothesis?
- Use the Excel Trendline again to obtain the equation of the model that seems to be the better fit to the data.
- Using this equation, compute the model-predicted $y$ value for each $x$ value in the dataset.
- Subtract the model-predicted $y$ values from the actual $y$ values in the dataset. We call these differences the fitting residuals.
- Make a histogram of the fitting residuals in Excel. Does the histogram look noticeably asymmetric?
(Hint: If you have chosen a good model for your data, then this histogram should look fairly symmetric.)
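
The exercise is meant to be done in Excel, but the same steps can be cross-checked with a short script. The following is a minimal sketch in Python with NumPy, not a prescribed solution. It rests on several assumptions that are not part of the exercise: the data is synthetic (a hypothetical stand-in for data.xlsx, whose contents are not shown here), the two competing hypotheses are taken to be a straight line and a quadratic, the sum of squared residuals is used to compare them, and the sample skewness is used as a rough numerical proxy for how asymmetric the residual histogram is. The names `fit_polynomial` and `skewness` are illustrative helpers.

```python
import numpy as np

# ASSUMPTION: synthetic data standing in for data.xlsx (NOT the course data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 1.5 * x**2 - 2.0 * x + 3.0 + rng.normal(0.0, 2.0, size=x.size)


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit of the given degree.

    Returns the coefficients (highest power first), the model-predicted
    y values, and the fitting residuals (actual minus predicted).
    """
    coeffs = np.polyfit(x, y, degree)
    y_pred = np.polyval(coeffs, x)
    residuals = y - y_pred
    return coeffs, y_pred, residuals


def skewness(values):
    """Sample skewness: near 0 for a symmetric distribution."""
    centered = values - values.mean()
    return float(np.mean(centered**3) / np.mean(centered**2) ** 1.5)


# Hypothesis 1 (assumed): y is linear in x.
coeffs_linear, _, res_linear = fit_polynomial(x, y, 1)
# Hypothesis 2 (assumed): y is quadratic in x.
coeffs_quad, _, res_quad = fit_polynomial(x, y, 2)

# Compare the two fits by their sum of squared residuals (smaller is better).
sse_linear = float(np.sum(res_linear**2))
sse_quad = float(np.sum(res_quad**2))

# Histogram of the residuals of the quadratic fit (15 equal-width bins).
counts, bin_edges = np.histogram(res_quad, bins=15)

print("linear coefficients:   ", coeffs_linear)
print("quadratic coefficients:", coeffs_quad)
print("SSE linear / quadratic:", sse_linear, sse_quad)
print("residual skewness linear / quadratic:",
      skewness(res_linear), skewness(res_quad))
print("histogram counts:", counts)
```

With real data, the synthetic `x` and `y` arrays would be replaced by the two columns of the spreadsheet. A model that misses curvature in the data tends to leave residuals that are systematically lopsided, which is what the histogram check in the last step is designed to reveal.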