De-mystifying the T-test

If you want to give your students some good insights into what a statistical test actually does, a good place to start is the DataClassroom Interactive Analysis of the T-test.

It takes about 5-10 minutes to perform the T-test, and on the way it introduces you to all the relevant concepts, and even asks you to do some thinking yourself!

You can read more about how to use this feature in the User Guide - this post is just a quick overview of some of the thinking that went into the pedagogical design of the interactive analysis. You can follow it by running our example analysis here.

Get them thinking

We decided to start with the big one - the P-value. The very first thing you get asked is to look at the data and estimate the P-value.


“What?” you are probably thinking. “I thought this was supposed to be a gentle introduction!”

True. But the best way to not get lost is to know where you are going, so what we’ve done here is give students a simple exercise:

Look at the graph and see what you think. Are the groups different?

Doesn’t sound so bad. And the interactive slider lets them explore what different P-values would mean, using simple language to introduce the various levels of uncertainty that are the basis of the concept of significance.

Hopefully, they are still feeling comfortable as they click “Done: Add Notes”.

The Lab Notebook

After each step, we log some information about what just happened in the Lab Notebook. This is deliberately written in more formal language, and mentions concepts along the way that are not present in the dialog. Why?

Our thinking here is to subtly introduce this formal language - like Dependent Variable - and use it to describe a process that they just completed and (hopefully) understood. This is to both build confidence and provide the language they need to write reports in the future.

More interactive estimation and calculation

And so it continues. The student is asked to make estimations of the means and the standard deviations (explained simply) for the data, again with the aim of imparting that feeling: “Hey! I could do that!”

But of course, the computer does the tedious work of calculating them exactly. No need to use up any of that limited store of mental energy on something boring, right?
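
For the curious, here is a minimal Python sketch of the "tedious work" behind that step. It is not DataClassroom's own code, and the two groups are made-up numbers chosen purely for illustration. It assumes the sample standard deviation (dividing by n - 1) is the one being reported.

```python
import statistics

# Hypothetical measurements for two groups (made-up numbers, not real data).
group_a = [2.0, 4.0, 6.0, 8.0]
group_b = [5.0, 7.0, 9.0, 11.0]


def summarize(values):
    """Return (mean, sample standard deviation) for a list of numbers."""
    # statistics.stdev divides by n - 1, i.e. it is the sample standard deviation.
    return statistics.mean(values), statistics.stdev(values)


mean_a, sd_a = summarize(group_a)
mean_b, sd_b = summarize(group_b)

print(f"Group A: mean = {mean_a:.2f}, SD = {sd_a:.2f}")
print(f"Group B: mean = {mean_b:.2f}, SD = {sd_b:.2f}")
```

Comparing these exact values with a student's eyeballed estimates is the whole point of the exercise.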

Signal versus Noise

Of course, once you get to things like group variance, the T statistic and so on, you can’t really ask students to do estimates. What we can do is present essential concepts visually, and key to almost all statistical tests is the concept of signal versus noise.

For the T-test, the signal is the difference between means, while the noise is the group variance. We show how these are weighed against each other, while doing the complicated math in the background.

We feel this weighing of signal vs noise is an essential part of much of the math, and indeed the meaning behind the results of the test.
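
To make the signal-versus-noise idea concrete, here is a rough Python sketch of the math running in the background. It assumes the classic two-sample T-test with pooled (equal) variances. The tool itself may use a different variant, and the data are made-up numbers for illustration only.

```python
import math
import statistics

# Hypothetical measurements for two groups (made-up numbers, not real data).
group_a = [2.0, 4.0, 6.0, 8.0]
group_b = [5.0, 7.0, 9.0, 11.0]


def t_statistic(a, b):
    """Pooled-variance two-sample T statistic: signal divided by noise."""
    n_a, n_b = len(a), len(b)

    # Signal: the difference between the two group means.
    signal = statistics.mean(a) - statistics.mean(b)

    # Noise: the spread within the groups, pooled and scaled by sample size.
    pooled_variance = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    noise = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))

    return signal / noise


t = t_statistic(group_a, group_b)
print(f"T = {t:.3f}")
```

A large difference between means only produces a large T value if the scatter within the groups is small by comparison. That is the weighing shown visually in the analysis.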

Completing the circuit

Finally, we go full circle as we look up* the P-value from the T-score, which gets them a familiar sight: the P-value as a position on a slider, expressing a level of confidence, aka significance.

They can still drag the slider - and explore what other values might have meant. This conveys the reality that results of statistical tests aren’t black and white, but on a continuum of uncertainty, and that interpretation takes some thought.

They can also think about their earlier estimate, and how far off they were. Plenty of opportunity for discussion there!
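
If students ask what "looking up" the P-value actually involves, one way to picture it is as the area under the tails of the T distribution beyond the observed T-score. The Python sketch below estimates that area by numerically integrating the probability density function with Simpson's rule. This is an illustrative approach, not necessarily how DataClassroom computes it. The function names and the `steps` parameter are our own inventions for this example.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's T distribution with df degrees of freedom."""
    # lgamma avoids overflow for large degrees of freedom.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """Two-sided P-value: the area in both tails beyond |t|.

    Uses Simpson's rule on [0, |t|]; steps must be an even number.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    # The distribution is symmetric, so double the area from 0 to |t|.
    central_area = 2 * (h / 3) * total
    return 1.0 - central_area


# Example: a hypothetical T-score of -1.64 with 6 degrees of freedom.
p = two_sided_p_value(-1.64, df=6)
print(f"P = {p:.3f}")
```

The further the T-score sits from zero, the smaller the remaining tail area, and so the smaller the P-value.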

Then it gets boring (we hope)

So that’s 5-10 minutes of work, getting some essential insights into how the T-test works along the way. It does of course also produce an accurate result, same as any statistical tool.

But after having done it 4-5 times, and having understood what’s going on, it will be boring. Then our goal has been achieved - they understand some stuff on a new and intuitive level, and it’s time to move on. Great!

For cranking out results, see the Graph-driven test. Much quicker, but still with that visual element.

Give it a go!

As mentioned above, you can go right into an analysis by opening one we already started (you can’t save your progress though). To start one of your own, open a Dataset (for the example above, go to Cicada size vs sex (T-test)) and click on Interactive Analysis.

The initial checklist is already filled out in the above example dataset, but if you start fully from scratch from a suitable dataset then there are a bunch of additional things to be learned by going through the checklist process.

You can choose to prepare datasets and save them with or without the checklist filled out, to give your students a variety of starting positions.

Sound good? Thoughts?

We’re convinced, and we hope you are too! Check out the User Guide section on Interactive Analyses for more detail.

If you have any comments, ideas or feedback on the DataClassroom tool, feel free to comment below, drop us an email or tweet. We built the tool using input from teachers, lecturers and students, and are always looking for valuable ideas!

Have fun!

The DataClassroom team


* NOTE: Looking up the P-value from the T-score can also be done visually, and is indeed done this way on the interactive Chi-square and Linear Regression analyses, which have a slider on the Probability Distribution Function. But we felt that the T-test, being more ‘basic’, was better off skipping this part. There is, however, the option to pop up an interactive ‘explainer’ showing how the lookup works.

Dan Temple