DataClassroom

Residuals and tests of normality

Two features we’d like to tell you about today:

  • Plotting residuals (and slopes) from regression lines

  • Plotting a reference normal curve on histograms, and testing for normality.

Okay, that was actually four things! Let’s go, starting with residuals:

A residual is the vertical distance between a data point and the value predicted by the line (or curve) of best fit: the observed value minus the predicted value. Check a box to see the residuals drawn as distances from the regression line.

You can also save the residual values as a new column of data. Then you can plot the residuals on a new graph such as a histogram, so you can see if there are any patterns in the data.
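If you'd like to see the arithmetic behind that new column, here is a minimal sketch in plain Python. It assumes a straight-line, ordinary least-squares fit. The function names and the sample numbers are made up for illustration and are not part of DataClassroom.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys):
    """Observed y minus the y predicted by the fitted line, one value per point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Illustrative (invented) data: roughly y = 2x with a little scatter.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys)  # this list plays the role of the saved residual column
```

With a least-squares line that includes an intercept, the residuals always sum to zero (up to rounding), so any pattern you spot in them is about their shape and spread, not their average.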

And now, slopes:

Here you can illustrate the slope of the regression line at each point. And yes, you can also plot these values.

The slope is the first derivative, or the rate of change, at any point on a curve. So if your x-axis is time, you can for example convert position values into speed, or speed values into acceleration. Or that kind of thing.
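To make the position-to-speed idea concrete, here is a rough sketch. Note the swap: instead of taking the slope of a fitted curve, it estimates the slope directly from neighbouring data points using finite differences. The function name and the numbers are invented for illustration.

```python
def pointwise_slopes(ts, ys):
    """Estimate dy/dt at each point.

    Interior points use a central difference (neighbour on each side);
    the two end points fall back to a one-sided difference.
    """
    n = len(ts)
    if n < 2:
        raise ValueError("need at least two points to estimate a slope")
    slopes = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slopes.append((ys[hi] - ys[lo]) / (ts[hi] - ts[lo]))
    return slopes


# Illustrative data: position = time squared, so the true speed is 2 * time.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
positions = [0.0, 1.0, 4.0, 9.0, 16.0]

speeds = pointwise_slopes(times, positions)
```

For this data the interior estimates match the true speed of 2 × time exactly, while the two end points are a little off because they only have a neighbour on one side. Running the same function on `speeds` would give an estimate of acceleration.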

Read more in our User Guide article.

Histograms and normality

Change of scene - on to histograms. Ever wondered what your data would have looked like if it followed the normal (aka Gaussian) distribution? Wonder no more:

Charles Darwin would have liked this feature when looking at the beak lengths of his Galapagos finches.
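For the curious, a reference normal curve can be sketched like this. The assumption here (ours, not necessarily how DataClassroom draws it) is that the curve uses the sample mean and standard deviation, and is scaled by the number of values times the bin width so that it lines up with histogram counts. The beak lengths below are invented for illustration.

```python
from statistics import NormalDist, mean, stdev


def normal_curve(values, bin_width, points=50):
    """Return (xs, heights) for a normal curve scaled to histogram counts.

    The curve has the same mean and standard deviation as the sample.
    Multiplying the density by n * bin_width puts it on the same vertical
    scale as a count histogram with that bin width.
    """
    dist = NormalDist(mean(values), stdev(values))
    lo, hi = min(values), max(values)
    step = (hi - lo) / (points - 1)
    xs = [lo + i * step for i in range(points)]
    heights = [len(values) * bin_width * dist.pdf(x) for x in xs]
    return xs, heights


# Invented beak lengths in mm, purely for illustration.
beak_lengths = [9.8, 10.1, 10.4, 10.6, 10.7, 10.9, 11.0, 11.0, 11.2,
                11.3, 11.5, 11.6, 11.8, 12.1, 12.5]

xs, heights = normal_curve(beak_lengths, bin_width=0.5)
```

Plot `heights` against `xs` on top of a histogram with 0.5 mm bins and you can compare the bars with what a normal distribution of the same mean and spread would predict.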

Tests of normality

Finally - we’ve also added two statistical tests of normality: the D’Agostino-Pearson test and the Shapiro-Wilk test.

You can run these on your data (like the above) and see whether the tests find evidence that it departs from a normal distribution.
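If you want to try the same two tests outside the app, here is a small sketch that assumes you have SciPy installed. `scipy.stats.normaltest` is SciPy's D’Agostino-Pearson test and `scipy.stats.shapiro` is its Shapiro-Wilk test. The wrapper function, the 0.05 cutoff, and the simulated data are our own choices for illustration.

```python
import random

from scipy import stats


def normality_report(values, alpha=0.05):
    """Run both normality tests and summarise the results in a dict.

    A small p-value (below alpha) is evidence that the data are NOT
    normally distributed; a large one means no evidence against normality.
    """
    k2, p_dagostino = stats.normaltest(values)  # D'Agostino-Pearson
    w, p_shapiro = stats.shapiro(values)        # Shapiro-Wilk
    return {
        "dagostino_pearson": {"statistic": float(k2), "p": float(p_dagostino)},
        "shapiro_wilk": {"statistic": float(w), "p": float(p_shapiro)},
        "rejects_normality": bool(p_dagostino < alpha or p_shapiro < alpha),
    }


# Simulated data for illustration: one bell-shaped sample, one strongly skewed.
random.seed(0)
bell_shaped = [random.gauss(10, 2) for _ in range(200)]
skewed = [random.expovariate(1.0) for _ in range(200)]

bell_report = normality_report(bell_shaped)
skewed_report = normality_report(skewed)
```

Keep in mind that neither test can prove data are normal. A large p-value only means the test found no evidence against normality, which is why it helps to look at the histogram as well.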

That’s all

As always, let us know by email if you have any burning desire for a new feature. We’ll definitely consider it, and we answer all our mail. Maybe we’re already working on it!