Lesson 4: Discover a Data Story

Overview

In this lesson, students will collaboratively investigate some datasets and use visualization tools to “discover a data story.” The lesson assumes that students know how to use some kind of visualization tool; in the previous lesson we used the charting tools of a basic spreadsheet program. Students should work with a partner, without much teacher hand-holding. Most of the time should be spent poking around the data and trying to discover connections and trends using data visualization tools. It is up to students to discover a trend, make a chart, and accurately write about it.

Purpose

Looking at large sets of data and using visualization as a tool for discovery is something that many people who work with data do on a daily basis. A computer scientist should have decent facility with tools for opening and browsing large datasets and for doing some cursory exploration to see what’s there. The computer scientist should be familiar enough with the tools to, over time, develop some instincts about data: how it’s collected, the kinds of formats it comes in, and how that affects what can or cannot be done to visualize it.

Agenda

Getting Started (10 mins)

Activity (40 mins)

Wrap Up (10 mins)

Assessment

Objectives

Students will be able to:

  • Collaboratively investigate a dataset.
  • Create a visualization (chart) from provided data.
  • Identify possible trends or connections in a dataset by creating visualizations of it.
  • Accurately communicate about a visualization of their own creation.

Preparation

Links

Heads Up! Please make a copy of any documents you plan to share with students.

For the Students

Teaching Guide

Getting Started (10 mins)

Fill out class tracker survey

Survey Reminder: Give students a few minutes to fill out the class tracker survey that you started in Lesson 7 - Introduction to Data.

Visualization as a discovery tool

Remarks

In the previous lesson, we learned how to use a data visualization tool to create a visualization. Sometimes data in its raw state is simply too big to be able to look at and derive any meaning. Even when the data is summarized in a table, it can be difficult to “see” what the data shows.

Today we're going to see how visualizing data can be a useful tool for discovery. In today’s activity, you and a partner will investigate some sets of data on your own and use visualization to discover a connection or trend.

Quick Investigation of a sample dataset

For today's work there are several datasets for you to choose from.

  • We’re going to take 5 minutes to poke around in one of the datasets to see how it’s structured.
  • Then we’ll come back together to get some terms straight before discovering further.

Go to Code Studio

  1. Find the link to the “Personality” dataset and open the folder.
  2. Find and open the README file.
  3. Find and open the rawData.csv file.
  4. Find and open one other .csv file - there are a few.

Discuss: What's in the folder for a dataset?

After students have had a few minutes to poke around, make sure the group understands what these files are.

You can use a think-pair-share or a simple whole group discussion to get the details out.

Ask the questions below; explanations are provided for you.

"What’s the README file?"

  • Most datasets, when you download them, contain a README file.
  • The README file is just a plain text document that gives some background information about the dataset, how it was collected, and what the column headings mean.
  • The README is a good first stop when trying to understand exactly what a dataset contains.

"What’s the rawData.csv file?"

  • For the datasets we provide, each folder contains a "raw" dataset, which is the original data, as it was collected.
  • Recall that .csv stands for “comma-separated values.” CSV is a common, plain text format for distributing datasets.
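
To make the CSV format concrete, here is a minimal Python sketch that parses a few lines of comma-separated text with the standard library's csv module. The sample data and the column names (id, age_group, hours_of_sleep) are invented for illustration and do not come from any of the lesson's datasets.

```python
import csv
import io

# Hypothetical sample: these columns and values are made up for illustration.
SAMPLE_CSV = """id,age_group,hours_of_sleep
1,teen,7.5
2,adult,6.0
3,teen,8.0
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dictionaries, one per data row.

    The first line is treated as the header, so each row is keyed
    by column name. All values come back as strings.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(reader)


rows = read_rows(SAMPLE_CSV)
print(len(rows))             # number of data rows, not counting the header
print(rows[0]["age_group"])  # look up a value by its column heading
```

Because a CSV file is plain text, every value is read as a string; numeric columns have to be converted (for example with float) before doing any arithmetic on them.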

"What’s in the other CSV files?"

  • The other files are what we call "summary tables."
  • These are tables that were created by running some computations on the raw data to do things like count, average, sum, compare, and categorize the data in interesting ways.
  • It is likely that these summary tables will be the data you use to create your visualizations.
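
As a rough illustration of how a summary table might be derived from raw data, the Python sketch below groups rows by a category and computes a count and an average for each group. The rows, the column names, and the summarize helper are all hypothetical; this is not how the provided summary tables were actually produced.

```python
from collections import defaultdict

# Hypothetical raw rows: column names and values are invented for illustration.
RAW_ROWS = [
    {"age_group": "teen", "hours_of_sleep": 7.5},
    {"age_group": "adult", "hours_of_sleep": 6.0},
    {"age_group": "teen", "hours_of_sleep": 8.5},
    {"age_group": "adult", "hours_of_sleep": 7.0},
]


def summarize(rows, group_key, value_key):
    """Return {group: {"count": n, "average": mean}} for the given columns."""
    groups = defaultdict(list)
    for row in rows:
        # Collect every value that belongs to the same category.
        groups[row[group_key]].append(row[value_key])
    return {
        group: {"count": len(values), "average": sum(values) / len(values)}
        for group, values in groups.items()
    }


summary = summarize(RAW_ROWS, "age_group", "hours_of_sleep")
for group, stats in sorted(summary.items()):
    print(group, stats["count"], stats["average"])
```

A table like this has one row per category instead of one row per response, which is usually the shape a spreadsheet charting tool needs for a bar or column chart.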

Activity (40 mins)

Discover a Data Story

Teaching Tip

A Note on distributing the Activity Guide

The first section of the activity guide contains the instructions above. It’s suggested that students start exploring the datasets before you distribute the activity guide so they don’t lose momentum.

You might choose to assign the datasets to groups. This cuts down on student choice, but might save time if students are taking a while to settle on which dataset they want to use.

While students are working:

  • Remind students that the guide Data Visualization 101: How to design charts and graphs is available to them.
  • Most of the students’ time should be spent on working collaboratively to visualize data in different ways.
  • Remind students that an “interesting” finding doesn’t have to be world-changing or mind-blowing. The data is so big and hard to “see” that simply making a clear chart that gives some kind of view into the data is interesting.

Pair: Put students in pairs or small groups to explore the datasets

Remarks

With your partner, explore the datasets and choose one you'd like to learn more about. Make sure you:

  • Read the README to understand the raw data that was collected.
  • Look at the summary tables provided for your dataset.
  • Repeat these steps with additional datasets.
  • Choose one to explore more deeply.

Discover a Data Story

Distribute: Activity Guide - Discover a Data Story

There is a link to this guide in Code Studio.

You may choose to have students make their own digital copies of this document and work on it there as well.

The activity guide asks students to:

  • Pick a dataset

  • Use visualization tools to “discover a data story”

  • Prepare one (or two) visualizations to present

  • Respond to prompts

Wrap Up (10 mins)

Share your data stories

Teaching Tip

For student sharing, there are a number of different things you could do, depending on your needs and classroom dynamic. Here are a few suggestions.

  • Have groups that used the same dataset share with each other.
  • Have each group share with one or two groups who used a different dataset.
  • Highlight one or two pairs’ work by asking them to present to the whole class.

Have students share their data stories with each other or with the whole class. A pair should:

  • Show the visualization they made.
  • Explain what it shows.
  • Explain the possible story it tells.

Assessment

Assessment Possibilities

Use the rubric to score the activity guide

You may choose to collect the second page of the Activity Guide and score it using the provided Rubric - Discover a Data Story.

Note: Collecting and scoring the Activity Guide is optional.

  • The intent of this activity is NOT to make a huge project out of it.
  • The goal is simply to come away with some artifact that you might assess.
  • It might be sufficient for students to share what they created in class rather than submitting the worksheet.

Personal Reflection: Collaboration

This prompt is also provided on Code Studio.

(NOTE: The following is a modification of one of the prompts given on the AP Create Performance task.)

Prompt: Describe the development process of discovering your data story and creating a visualization. Describe the difficulties and/or opportunities you encountered along the way, and describe the collaborative process between you and your partner.

Please limit your response to about 200 words.

Standards Alignment

Computer Science Principles

1.1 - Creative development can be an essential process for creating computational artifacts.
1.1.1 - Apply a creative development process when creating computational artifacts. [P2]
  • 1.1.1A - A creative process in the development of a computational artifact can include, but is not limited to, employing nontraditional, nonprescribed techniques; the use of novel combinations of artifacts, tools, and techniques; and the exploration of personal cu
  • 1.1.1B - Creating computational artifacts employs an iterative and often exploratory process to translate ideas into tangible form.
1.2 - Computing enables people to use creative development processes to create computational artifacts for creative expression or to solve a problem.
1.2.1 - Create a computational artifact for creative expression. [P2]
  • 1.2.1A - A computational artifact is anything created by a human using a computer and can be, but is not limited to, a program, an image, audio, video, a presentation, or a web page file.
  • 1.2.1B - Creating computational artifacts requires understanding and using software tools and services.
  • 1.2.1C - Computing tools and techniques are used to create computational artifacts and can include, but are not limited to, programming IDEs, spreadsheets, 3D printers, or text editors.
1.2.2 - Create a computational artifact using computing tools and techniques to solve a problem. [P2]
  • 1.2.2A - Computing tools and techniques can enhance the process of finding a solution to a problem.
1.2.4 - Collaborate in the creation of computational artifacts. [P6]
  • 1.2.4A - A collaboratively created computational artifact reflects effort by more than one person.
1.2.5 - Analyze the correctness, usability, functionality, and suitability of computational artifacts. [P4]
  • 1.2.5D - The suitability (or appropriateness) of a computational artifact may be related to how it is used or perceived.
1.3 - Computing can extend traditional forms of human expression and experience.
1.3.1 - Use computing tools and techniques for creative expression. [P2]
  • 1.3.1E - Computing enables creative exploration of both real and virtual phenomena.
3.1 - People use computer programs to process information to gain insight and knowledge.
3.1.1 - Use computers to process information, find patterns, and test hypotheses about digitally processed information to gain insight and knowledge. [P4]
  • 3.1.1D - Insight and knowledge can be obtained from translating and transforming digitally represented information.
  • 3.1.1E - Patterns can emerge when data is transformed using computational tools.
3.1.2 - Collaborate when processing information to gain insight and knowledge. [P6]
  • 3.1.2A - Collaboration is an important part of solving data driven problems.
  • 3.1.2B - Collaboration facilitates solving computational problems by applying multiple perspectives, experiences, and skill sets.
  • 3.1.2C - Communication between participants working on data driven problems gives rise to enhanced insights and knowledge.
  • 3.1.2D - Collaboration in developing hypotheses and questions, and in testing hypotheses and answering questions, about data helps participants gain insight and knowledge.
  • 3.1.2F - Investigating large data sets collaboratively can lead to insight and knowledge not obtained when working alone.
3.1.3 - Explain the insight and knowledge gained from digitally processed data by using appropriate visualizations, notations, and precise language. [P5]
  • 3.1.3A - Visualization tools and software can communicate information about data.
  • 3.1.3B - Tables, diagrams, and textual displays can be used in communicating insight and knowledge gained from data.
  • 3.1.3C - Summaries of data analyzed computationally can be effective in communicating insight and knowledge gained from digitally represented information.
  • 3.1.3D - Transforming information can be effective in communicating knowledge gained from data.

CSTA K-12 Computer Science Standards (2017)

DA - Data & Analysis
  • 3A-DA-11 - Create interactive data visualizations using software tools to help others better understand real-world phenomena.
  • 3B-DA-05 - Use data analysis tools and techniques to identify patterns in data representing complex systems.
  • 3B-DA-06 - Select data collection tools and techniques to generate data sets that support a claim or communicate information.
  • 3B-DA-07 - Evaluate the ability of models and simulations to test and support the refinement of hypotheses.