Census at School
Using secondary data and looking at associations
GAISE II: Level A, Ex 6
GAISE II (Guidelines for Assessment and Instruction in Statistics Education II) is a framework for statistics and data science education, developed in response to the need for data literacy in education. You can read more about its mission and goals here, pages 1-3.
The following activity (data and questions) is adapted from, and in some cases draws directly from, the GAISE II framework. We have adapted these data and questions for use with DataClassroom.
While these data are a great foundation for starting conversations around statistical questions, we encourage students to gather their own data and use these practiced skills to answer questions important to them. (Find more information on using your own data in DataClassroom here).
In the answer key, if teacher notes are directly pulled from GAISE II, they are in italics.
Background
At a local elementary school, Mr. Johnson’s fifth-grade class was surveyed on several topics. Mr. Johnson heard about CensusAtSchool (an international classroom project that engages fourth- through twelfth-grade students in statistical problem solving using their own real data) and had his students add their survey data to the site.
While the survey asked 13 different questions, we will be looking for any trends among just 5 answered questions. We are most interested in investigating trends between student height, arm span, foot length, means of getting to school, and how long it takes them to get to school.
Dataset
“Under the direction of their teachers, students in grades 4–12 anonymously complete an online questionnaire, thus submitting the data to a national database. The questions ask about such things as the length of their right foot, height, favorite subject in school, and how long it takes them to get to school.” (About census at school. Census at School - United States. (n.d.). https://ww2.amstat.org/censusatschool/about.cfm) The example data in this activity are typical of those collected through that online questionnaire.
Students used this measurement guide to reduce variability in the measurement data.
Variables
Student: This info variable identifies the student giving the information.
How you get to school: This categorical variable records how the surveyed student gets to school. Values include motor, walk, bus, and other.
Height (cm): This numeric variable measures the student's height. Measured in centimeters.
Arm Span (cm): This numeric variable measures the student's arm span (the distance from the middle fingertip of the left hand to the middle fingertip of the right hand). Measured in centimeters.
Right Foot Length (cm): This numeric variable measures the length of the student's right foot. Measured in centimeters.
Time it takes to get to school (min): This numeric variable references the time it takes students to get to school, using their mode of transportation. Measured in minutes.
Activity
Part 1: Looking at one variable (In GAISE II, this is referred to as a summary statistical investigative question)
1. Design a statistical question around one of the variables. Ex: How tall are the students in the class? Write the question below.
2. Use the Make a Graph tool to view the data for your statistical question. Screenshot it below:
3. Looking at the graph you created, how would you answer your question from #1?
Part 2: Comparing two variables (In GAISE II, this is referred to as a comparison statistical investigative question)
4. Here is our investigative question: Do the students in this fifth-grade class who travel by bus tend to take longer to get to school than the students in this class who walk to school? Which two variables are we going to graph in order to answer this question?
5. Use the Make a Graph tool to create a graph for your statistical question. Select the categorical variable for the x-axis and the numeric variable for the y-axis. Screenshot your graph below:
6. Use your graph to answer the question from #4.
Optional Extension A. Using averages for comparing groups: Click the box for descriptive stats to add them to your graph. Select “mean based” if it is not already selected. Screenshot your new graph below:
What evidence do these mean-based stats provide for answering your question from #4? Does your answer in #6 change at all?
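Teacher note: for anyone who wants to check a mean-based comparison outside DataClassroom, the idea can be sketched in a few lines of Python. This is only an illustration. The travel times below are invented for the example and are not values from the Census at School dataset, and the variable names are our own.

```python
from statistics import mean

# Hypothetical travel times in minutes, invented for illustration only.
# These are NOT values from the Census at School dataset.
travel_times = {
    "bus": [15, 20, 25, 30],
    "walk": [5, 10, 10, 15],
}

# Mean travel time for each way of getting to school.
group_means = {mode: mean(times) for mode, times in travel_times.items()}

# A positive difference means the bus group took longer on average.
difference = group_means["bus"] - group_means["walk"]

for mode, avg in group_means.items():
    print(f"{mode}: mean travel time = {avg} min")
print(f"bus minus walk = {difference} min")
```

A difference in means is only one piece of evidence. Looking at how much the two groups overlap on the graph is still needed before answering the question from #4.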
Part 3: Looking for an association (relationship) between two numeric variables (In GAISE II, this is referred to as an association statistical investigative question)
7. Select two of the following numeric variables: arm span, height, foot length. Write a question that asks how they may be related (or associated) to each other. Ex: Is there an association between height and arm span for the students in this fifth-grade class? Write your question below:
8. Make a scatter plot graph of your two variables. Try switching which variable is the x-variable and which is the y-variable. Select which orientation you prefer. Screenshot it below:
9. Use your graph to answer your question from #7. Discuss the strength of association between the two variables as well as anything else you note about the graph.
10. To better describe the association between the variables you chose to graph in #8, add a line of best fit (linear) by adding a regression line. Screenshot your new graph (with a regression line) below:
11. What did adding the line of best fit show you about the association between your two variables?
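Teacher note: a line of best fit is commonly computed by least squares. The sketch below shows that calculation, along with the correlation coefficient as one measure of the strength of a linear association. We are assuming a least-squares fit here; we have not confirmed how DataClassroom computes its regression line. The heights and arm spans are invented for the example and are not from the Census at School dataset, and the function name is our own.

```python
from math import sqrt
from statistics import mean

# Hypothetical measurements in centimeters, invented for illustration only.
# These are NOT values from the Census at School dataset.
heights = [140, 145, 150, 155, 160]
arm_spans = [138, 146, 149, 156, 161]


def least_squares_line(xs, ys):
    """Return (slope, intercept, r) for a least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)  # correlation coefficient, between -1 and 1
    return slope, intercept, r


slope, intercept, r = least_squares_line(heights, arm_spans)
print(f"arm span = {slope:.2f} * height + {intercept:.2f}")
print(f"correlation r = {r:.3f}")
```

A positive slope means taller students tend to have longer arm spans, and a value of r close to 1 or -1 means the points lie close to the line.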