Planning an experiment with a simulation
We know it can be difficult to get on top of the important data analysis phase of an experiment. You’ve got a load of measurements, but how should you arrange them? Maybe you’ve read our article on Tidy Data, which helps, but now there is another tool to assist you.
Our Simulations feature lets you move the “how will I analyze this” part of the thinking to before you perform the experiment. This can result in a better-planned experiment and help ensure you collect the data you will need, in an immediately usable form.
How does it help?
You can make a Simulation that will generate data of the form that you might get from your experiment. This can be used as a template, or as a sanity check on which measurements you plan to take.
You could even set it up to generate two sets of data:
What it would look like if the hypothesis is true
What it would look like if the hypothesis is false
This is important: seeing both versions side by side before you start can really help the analysis go more smoothly later on!
A worked example
Here’s an example. Let’s say you want to test if a given fertilizer makes certain plants grow taller. You’re going to:
Separate some seedlings into two groups
Give only one of them the fertilizer, and let them grow
Measure their heights.
Your hypothesis might be something like “Adding fertilizer makes the plants grow taller”.
So, how do we simulate this?
Making a model
You will first make a model of the experimental subject. This sounds complex, but it is fairly straightforward, and also ensures that you are thinking and planning in terms of variables and observations/samples. Read all about models here.
You can probably immediately see one variable you’ll need:
Variable 1: Categorical, named “Fertilizer used” with two possible values “yes” and “no”
What about the other one, your result?
Without a planning stage, you might just go ahead and measure the height of each plant every day for two weeks. Then you’d be thinking about maybe plotting these as a graph of height per day for each plant... Hmm. That’s a lot of data. How are you going to conclude something from that?
So let’s think.
Back to the hypothesis. What do you mean by “did the plants grow taller”? You’re not expecting to measure “growth velocity” or something complicated.
After some thinking, you decide that you want to see how tall the plants are at the end of a two-week period. That’s a nice, single numeric value that can be compared between groups. So now you have your second variable:
Variable 2: Numeric, named “Height at 2 weeks”
Great!
Now you have something you can model
You open up the Simulation Model editor (more info here) and add the two variables you’ve just concluded you need.
For Height you can make some simple assumptions based on your knowledge of natural variability in such parameters. For example, you could decide that these simulated plants will have an average height of 10 cm after 2 weeks, and that their heights will follow a normal distribution with a standard deviation of 2 cm.
For Fertilized you can make a categorical variable and assign a 50% likelihood to each of “yes” and “no”.
Having trouble? You can save your own copy of our example model here.
Your model is now ready.
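If you like to see ideas in code, here is a minimal Python sketch of what this model describes. It is an illustration only, not how the DataClassroom Simulator works internally; the function name simulate_plants and the seed value are our own assumptions.

```python
import random
import statistics

def simulate_plants(n, mean_height=10.0, sd_height=2.0, seed=None):
    """Return n simulated plants, one dict (row) per plant.

    Hypothetical helper: heights come from a normal distribution
    (mean 10 cm, SD 2 cm) and fertilizer is "yes" or "no" with equal
    probability. The two variables are NOT connected in this model.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rows.append({
            "Fertilizer used": rng.choice(["yes", "no"]),
            "Height at 2 weeks": round(rng.gauss(mean_height, sd_height), 1),
        })
    return rows

plants = simulate_plants(200, seed=1)
heights = [p["Height at 2 weeks"] for p in plants]

print(plants[:3])                                   # first three rows
print("mean height:", round(statistics.mean(heights), 2))
```

Each row is one plant and each key is one variable, which is the same shape as the data table you would fill in during the real experiment.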
You can now simulate the plants growing!
If you want to see what the resulting data table would look like, you can now run a simulation on the model. In the Model Editor, select “Simulate” from the left-hand menu:
You are now looking at the Simulator, with the Model embedded in it.
You’ll need to:
Select Height as the model Response variable
Select Fertilized as the model Predictor variable
Then, you can generate a set of samples for any number of plants. You’ll see a histogram of how their heights are distributed, and be able to group and color the data by the Fertilizer variable.
They’re not responding…
Because you haven’t made any connection between the two variables in the model, your model describes a plant that does not respond to this fertilizer.
Note: editing model connections requires a Teacher License, or our College Student license (DataClassroom U)
To simulate a plant that does respond to fertilizer, save a copy of the simulation and call it something like “Plants responding to fertilizer”. In this new Simulation, edit the model and add a Connection between the two variables (shown on this video). Click on the connection with Fertilizer as Predictor variable and Height as Response variable:
You could for example set the connection to be like this:
Now, plants that get fertilizer will be on average 2 cm taller than those without.
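To make the difference between the two models concrete, here is a rough Python sketch of both cases. It assumes the connection simply adds a fixed amount to the height of fertilized plants; the names simulate_plants, group_means and effect_cm are hypothetical and not part of DataClassroom.

```python
import random
import statistics

def simulate_plants(n, effect_cm=0.0, seed=None):
    """Simulate n plants; fertilized plants get effect_cm added on average."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        fertilized = rng.choice(["yes", "no"])
        height = rng.gauss(10.0, 2.0)          # baseline: mean 10 cm, SD 2 cm
        if fertilized == "yes":
            height += effect_cm                # the assumed "connection"
        rows.append({"Fertilizer used": fertilized,
                     "Height at 2 weeks": height})
    return rows

def group_means(rows):
    """Mean height for the "yes" and "no" fertilizer groups."""
    means = {}
    for group in ("yes", "no"):
        heights = [r["Height at 2 weeks"] for r in rows
                   if r["Fertilizer used"] == group]
        means[group] = statistics.mean(heights)
    return means

# Same seed, so the only difference between the two runs is the connection.
no_response = group_means(simulate_plants(1000, effect_cm=0.0, seed=1))
responding = group_means(simulate_plants(1000, effect_cm=2.0, seed=1))

print("hypothesis false:", no_response)
print("hypothesis true: ", responding)
```

With no connection the two group means should land close together; with the 2 cm connection the “yes” group should sit about 2 cm higher.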
Having trouble? You can save your own copy of our example model here.
Save it, and go back to the simulator. Now this simulation will make something more interesting:
You are now generating example result data for your experiment (for both true and false hypotheses).
Having trouble? You can open our copy of this simulation here.
More possibilities - make a dataset!
You can also hit “Create dataset” to look at the data with all the features of the DataClassroom graphing tool. Then think about these questions:
What kind of visualization (graph) am I going to use?
What sort of statistical test* can I run on this data?
How many samples (plants) will I need in order to make a conclusion?
* Note: If it’s a T-test, Chi-square or Linear Regression test, you can also try an Interactive Analysis, to gain a deeper understanding of the math that will be going into running the statistical test, and what the result of the test actually means.
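The sample-size question can also be explored by simulation. The sketch below is a rough illustration under our own assumptions: equal-sized groups, the same 10 cm / 2 cm SD model as above, Welch’s t statistic, and a rule-of-thumb threshold of |t| > 2 in place of a proper p-value. The names welch_t and detection_rate are hypothetical.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

def detection_rate(n_per_group, effect_cm, trials=1000, seed=0):
    """Fraction of simulated experiments in which |t| exceeds 2.

    The threshold of 2 is only a rule of thumb (an assumption here);
    a real test would use the t distribution to get a p-value.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        control = [rng.gauss(10.0, 2.0) for _ in range(n_per_group)]
        treated = [rng.gauss(10.0 + effect_cm, 2.0) for _ in range(n_per_group)]
        if abs(welch_t(treated, control)) > 2.0:
            hits += 1
    return hits / trials

for n in (5, 10, 20, 40):
    with_effect = detection_rate(n, effect_cm=2.0)
    without_effect = detection_rate(n, effect_cm=0.0)
    print(f"{n:>3} plants per group: "
          f"detected {with_effect:.0%} with a 2 cm effect, "
          f"{without_effect:.0%} with no effect")
```

If the detection rate with a 2 cm effect is low for the number of plants you had in mind, that is a hint to plan for more plants. The rate with no effect shows how often chance alone crosses the threshold.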
What have we achieved?
Let’s take a step back. What were we doing here? Just planning the experiment!
There should now be a higher chance that:
The experiment is a learning experience
We avoid the frustration of reaching the end of two weeks and being unsure what to do with the data that have been collected
Data gets collected in the most usable table format - aka Tidy Data
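As one way to carry the plan into the lab, the short Python sketch below builds an empty recording sheet in the same tidy shape: one row per plant, one column per variable. It is our own illustration, not a DataClassroom feature; the function name recording_sheet and the extra “Plant” ID column are assumptions.

```python
import csv
import io
import random

def recording_sheet(n_plants, seed=None):
    """Return CSV text for a blank, tidy recording sheet.

    Half of the plants (chosen at random) are assigned fertilizer.
    The height column is left empty, to be filled in after two weeks.
    """
    rng = random.Random(seed)
    ids = list(range(1, n_plants + 1))
    rng.shuffle(ids)
    fertilized = set(ids[: n_plants // 2])     # random half gets fertilizer

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Plant", "Fertilizer used", "Height at 2 weeks"])
    for plant in range(1, n_plants + 1):
        writer.writerow([plant, "yes" if plant in fertilized else "no", ""])
    return buffer.getvalue()

sheet = recording_sheet(10, seed=1)
print(sheet)
```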
Sound good?
If this sounds like what you’ve been waiting for, register for a free trial of DataClassroom, check out the website, browse the searchable User Guide, watch one of our videos, or get in touch and see how we can help!