Unit 2 - Digital Information (Last update: May 2016)


This unit further explores the ways that digital information is encoded, represented and manipulated. In this unit students will look at and generate data, clean it, manipulate it, and create and use visualizations to identify patterns and trends.

Many of the lessons that follow have worksheets and student guides associated with activities. Those worksheets are listed in the relevant lesson plan, or you can check out all unit 2 student-facing activity guides here. You can access a flat pdf of all the lessons in unit 2 here.

Chapter 1: Encoding and Compressing Complex Information

Big Questions

  • Are the ways in which digital information is encoded more like laws of nature, or are they man-made?
  • What kinds of limitations does the binary encoding of information impose on what can be represented inside a computer?
  • How accurately can human experience and perception be captured or reflected in digital information?

Enduring Understandings

  • 1.1 Creative development can be an essential process for creating computational artifacts.
  • 1.3 Computing can extend traditional forms of human expression and experience.
  • 2.1 A variety of abstractions built upon binary sequences can be used to represent all digital data.
  • 3.3 There are trade-offs when representing information as digital data.

Week 1

Lesson 1: Bytes and File Sizes


In this lesson students are introduced to the standard units for measuring the sizes of digital files, from a single byte, all the way up to terabytes and beyond. Students begin the lesson by comparing the size of a plain text file containing “hello” to a Word document with the same contents. Students are introduced to the units kilobyte, megabyte, gigabyte, and terabyte, and research the sizes of files they make use of every day, using the appropriate terminology. This lesson foreshadows an investigation of compression as a means for combatting the rapid growth of digital data.
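
The unit ladder described above can be sketched in a few lines of Python. This is only an illustration, not part of the lesson materials: the function name `human_readable` is made up here, and the sketch assumes each unit is 1,000 times the previous one (some tools use 1,024 instead).

```python
# Unit names in increasing order. Assumption: each step is a factor of 1,000.
UNITS = ["bytes", "KB", "MB", "GB", "TB"]

def human_readable(num_bytes, base=1000):
    """Express a byte count in the largest unit that keeps the value at 1 or more."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the number is small enough,
        # or at the last unit we know about.
        if size < base or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= base

# A plain text file holding "hello" needs one byte per character in ASCII.
hello_size = len("hello".encode("ascii"))
print(human_readable(hello_size))   # 5.0 bytes
print(human_readable(3_500_000))    # 3.5 MB
```

A Word document with the same contents is much larger than 5 bytes because the file also stores formatting and other information alongside the text.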

Teacher Links: Answer Key
Student Links: Activity Guide

Lesson 2: Text Compression

Widget - Text Compression | Individual and Group Discovery

At some point we reach a physical limit on how fast we can send bits. If we want to send a large amount of information faster, we have to find a way to represent the same information with fewer bits - we must compress the data.
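
One way to picture "fewer bits for the same information" is dictionary substitution: repeated phrases are replaced by short symbols, and a dictionary records what each symbol stands for. The sketch below is an illustration under assumptions, not a description of how the Text Compression widget works internally. The names `compress` and `decompress` are hypothetical, and the sketch assumes the symbols never appear in the original text.

```python
def compress(text, dictionary):
    """Replace each dictionary phrase with its one-character symbol."""
    for symbol, phrase in dictionary.items():
        text = text.replace(phrase, symbol)
    return text

def decompress(text, dictionary):
    """Undo compress() by expanding the symbols in reverse order."""
    for symbol, phrase in reversed(list(dictionary.items())):
        text = text.replace(symbol, phrase)
    return text

lyric = "row row row your boat gently down the stream"
dictionary = {"#": "row "}          # symbol -> phrase it stands for

packed = compress(lyric, dictionary)
print(packed)                        # ###your boat gently down the stream
print(len(lyric), len(packed))       # 44 35

# Nothing is lost: the original text can be rebuilt exactly.
assert decompress(packed, dictionary) == lyric
```

The dictionary has to be sent along with the compressed text, so the real saving is smaller than the difference between the two lengths.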

Teacher Links: Activity Recap
Student Links: Activity Guide | Activity Guide | Video | Activity Guide

Lesson 3: Encoding B&W Images

Widget - Pixelation | Concept Invention | Individual Creation

In this lesson, students will begin to explore the way digital images are encoded in binary. The class begins by asking students to invent their own image encoding protocol in order to familiarize themselves with some of the subtle complications of encoding images, namely the need for other data, called metadata, that describes properties of the image necessary for rendering it. Students will learn about pixels, raster images, and what an image file format is. Students will encode binary image data using a widget in Code Studio.
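
The role of metadata can be shown with a small decoder. The layout below is an assumption made for illustration (4 bits of width, then 4 bits of height, then one bit per pixel); it is not the file format used by the Code Studio widget, and the function names are hypothetical.

```python
def decode_image(bits, header_bits=4):
    """Split a bit string into rows of pixels.

    Assumed layout: the first header_bits give the width, the next
    header_bits give the height, and the rest is one bit per pixel.
    """
    width = int(bits[:header_bits], 2)
    height = int(bits[header_bits:2 * header_bits], 2)
    pixels = bits[2 * header_bits:]
    if len(pixels) != width * height:
        raise ValueError("pixel data does not match the width and height")
    # Cut the pixel data into one string per row.
    return [pixels[r * width:(r + 1) * width] for r in range(height)]

def render(rows):
    """Draw each 1 as '#' and each 0 as '.' (an arbitrary choice)."""
    return "\n".join(row.replace("1", "#").replace("0", ".") for row in rows)

# Metadata: width 3 (0011), height 3 (0011). Then 9 pixel bits.
encoded = "0011" + "0011" + "101" + "010" + "101"
rows = decode_image(encoded)
print(render(rows))
```

Without the width and height, the same nine pixel bits could just as well be read as a 9-by-1 or 1-by-9 image, which is why the metadata is needed to render it.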

Teacher Links: Answer Key | Video
Student Links: Activity Guide | Activity Guide | Video | Activity Guide

Week 2

Lesson 4: Encoding Color Images

Widget - Pixelation | Individual Creation

In this lesson students are asked to consider how color is represented on a computer and to imagine how it might be encoded in binary. Students then learn about how color is actually represented on a computer - using the RGB color scheme - and create their own images in a new version of the pixelation widget that allows you to use more than 1 bit per pixel to represent color information. After grappling with the prospect of needing many bits just to represent a single pixel, students are shown how using hexadecimal allows us to represent many bits with fewer characters. Students use a new version of the pixelation tool to encode an image with color and create a personal favicon.
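
The link between bits, hexadecimal, and RGB can be sketched as follows. The example assumes 24 bits per pixel (8 each for red, green, and blue), which is one common choice rather than the only setting the widget allows; the function names are hypothetical.

```python
def bits_to_hex(bits):
    """Convert a bit string (length a multiple of 4) to hexadecimal digits."""
    if len(bits) % 4 != 0:
        raise ValueError("bit string length must be a multiple of 4")
    # Every group of 4 bits becomes exactly one hex digit.
    return "".join(format(int(bits[i:i + 4], 2), "X")
                   for i in range(0, len(bits), 4))

def hex_to_rgb(hex_color):
    """Split a 6-digit hex color into (red, green, blue) values from 0 to 255."""
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))

# 24 bits for one pixel: red = 11111111, green = 10000000, blue = 00000000
pixel_bits = "11111111" + "10000000" + "00000000"

hex_color = bits_to_hex(pixel_bits)
print(hex_color)               # FF8000
print(hex_to_rgb(hex_color))   # (255, 128, 0)
```

Six hex characters stand in for 24 binary digits, which is the saving in characters that the lesson points to.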

Teacher Links: Video | Answer Key | Answer Key | Answer Key
Student Links: Video | Worksheet | Activity Guide | Activity Guide | Activity Guide | Rubric

Lesson 5: Lossy Compression and File Formats


This lesson is mostly an investigation of different kinds of file formats that exist in the real world. The lesson begins with students exploring a mock “lossy” text compression scheme as a way to learn about “lossy” compression. Then we do a jigsaw “rapid research” activity in which pairs of students research a real image, text, or sound encoding file format and determine what kind of compression it uses and the theory behind it. This lesson also sets the stage for the practice Performance Task (Encode an Experience) that follows this lesson.
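
A mock lossy text scheme might look like the sketch below. The rule used here (drop every vowel except one that starts a word) is an assumption chosen for illustration and may differ from the scheme in the lesson's own materials; `lossy_compress` is a hypothetical name.

```python
def lossy_compress(text):
    """Drop every vowel that is not the first letter of a word."""
    words = []
    for word in text.split(" "):
        # Keep the first character, then filter vowels out of the rest.
        kept = word[:1] + "".join(ch for ch in word[1:]
                                  if ch.lower() not in "aeiou")
        words.append(kept)
    return " ".join(words)

original = "lossy compression throws information away"
smaller = lossy_compress(original)

print(smaller)                      # lssy cmprssn thrws infrmtn awy
print(len(original), len(smaller))
```

A reader can often guess the original words, but there is no procedure that is guaranteed to restore them, which is what separates lossy from lossless compression.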

Teacher Links: Answer Key
Student Links: Worksheet | App Lab | Video (lossy in second half)

Week 3

Lesson 6: Practice PT - Encode an Experience

Practice PT | Unplugged | Individual Creation

In this 2-day lesson, students will design their own way to encode a personal experience (such as attending a party, playing a game, etc.). The project begins with students doing some top-down design to figure out the components and subcomponents of an experience that are encodable as binary information. Students then select a portion of the experience to flesh out into a more detailed design. The project includes written reflection questions similar to those students will see on the AP Performance Tasks. While students will complete this project individually, they will exchange feedback with a classmate at one point in the project.

Student Links: Activity Guide (includes Rubric) | Project Templates

Chapter Commentary

This chapter will look and feel a lot like Unit 1, Chapter 1. It is in many ways a continuation of the concepts and activities around how to encode information. The difference now is that the information isn’t being encoded explicitly for the purpose of transmitting the data. Rather, the focus is on coming up with creative ways to encode types of information that less obviously lend themselves to binary encoding, or to discrete pieces of data at all.

Chapter 2: Manipulating and Visualizing Data

Big Questions

  • What is the relationship between data, information and knowledge?
  • What are the best ways to find, see, and extract meaningful trends and patterns from raw data?
  • Where and how does human bias affect the collection, processing and interpretation of data?

Enduring Understandings

  • 1.3 Computing can extend traditional forms of human expression and experience.
  • 3.1 People use computer programs to process information to gain insight and knowledge.
  • 3.2 Computing facilitates exploration and the discovery of connections in information.
  • 3.3 There are trade-offs when representing information as digital data.
  • 7.1 Computing enhances communication, interaction, and cognition.
  • 7.3 Computing has a global effect -- both beneficial and harmful -- on people and society.

Week 4

Lesson 7: Introduction to Data

Unplugged | External Tools | Individual and Group Discovery

In this kickoff to the Data Unit, students begin thinking about how data is collected and what can be learned from it. To begin the lesson, students will take a short online quiz that supposedly determines something interesting or funny about their personality. Afterwards they will brainstorm other sources of data in the world around them, leading to a discussion of how that data is collected. This discussion motivates the introduction of the Class Data Tracker project that will run through the second half of this unit. Students will take the survey for the first time and be shown what the results will look like. To close the class, students will make predictions of what they will find when all the data has been collected in a couple weeks.

Teacher Links: Google Form Setup (includes template)
Student Links: Link

Lesson 8: Finding Trends with Visualizations

External Tools | Research | Presentation

Students use the Google Trends tool in order to visualize historical search data. They will need to identify interesting trends or patterns in their findings and will attempt to explain those trends, based on their own experience or through further research online. Afterwards, students will present their findings to ensure they are correctly identifying patterns in a visualization and are providing plausible explanations of those patterns.

Student Links: Activity Guide | Link

Lesson 9: Check Your Assumptions

Research | Class Discussion

This lesson asks students to consider carefully the assumptions they make when interpreting data and data visualizations. The class begins by examining how the Google Flu Trends project tried and failed to use search trends to predict flu outbreaks. They will then read a report on the Digital Divide which highlights how access to technology differs widely by personal characteristics like race and income. This report challenges a widespread assumption that data collected online is representative of the population at large. To practice identifying assumptions in data analysis, students are provided a series of scenarios in which data-driven decisions are made based on flawed assumptions. They will need to identify the assumptions being made (most notably those related to the digital divide) and explain why these assumptions lead to incorrect conclusions.

Teacher Links: Answer Key
Student Links: Activity Guide | Video

Lesson 10: Good and Bad Data Visualizations

Analyzing Artifacts | Group Discovery | Class Discussion

This is a pretty fun lesson that has two main parts. First students warm up by reflecting on the reasons data visualizations are used to communicate about data. This leads to the main activity in which students look at some collections of (mostly bad) data visualizations, rate them, explain why a good one is effective, and also suggest a fix for a bad one.

Student Links: Link | Worksheet

Week 5

Lesson 11: Making Data Visualizations

External Tools | Individual Skill Building | Tutorial

Now that students have had the chance to see and evaluate various data visualizations, they will learn to make visualizations of their own. This lesson teaches students how to build visualizations from provided datasets. The levels in Code Studio provide a detailed walkthrough of how to use Google Sheets to create several different kinds of charts. While this lesson focuses on the Google Sheets tool, other tools may be substituted at the teacher’s discretion, and MS Excel support is coming soon to the lesson.

Teacher Links: Answer Key
Student Links: Link | Data Set

Lesson 12: Discover a Data Story

External Tools | Collaborative Artifact Creation | Writing

In this lesson, students will collaboratively investigate some datasets and use visualization tools to “discover a data story.” The lesson assumes that students know how to use some kind of visualization tool - in the previous lesson we used the charting tools of a basic spreadsheet program. Students should be working with a partner but without much teacher hand-holding. Most of the time should be spent with students poking around the data and trying to discover connections and trends using data visualization tools. It is up to them to discover a trend, make a chart, and accurately write about it.

Student Links: Activity Guide | Link | Rubric | Folder

Week 6

Lesson 13: Cleaning Data

External Tools | Analyzing | Group Skill Building

In this lesson, students begin working with the data that they have been collecting since the first lesson of the chapter in the class “data tracker.” They are introduced to the first step in analyzing data: cleaning the data. Students will follow a guide in Code Studio, which demonstrates the common techniques of filtering and sorting data to familiarize themselves with its contents. Then they will correct errors they find in the data by either hand-correcting invalid values or deleting them. Finally they will categorize any free-text columns that were collected to prepare them for analysis. This lesson introduces many new skills with spreadsheets and reveals the sometimes subjective nature of data analysis.
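
The lesson does this work in a spreadsheet, but the same cleaning steps can be sketched in Python to make them concrete. Everything in the sample below is invented for illustration: the column names, the values, the 0 to 24 hour validity rule, and the mood categories are assumptions, not the contents of the class data tracker.

```python
# Hypothetical survey rows; names and values are invented for illustration.
raw_rows = [
    {"name": "Ana", "hours_sleep": "8", "mood": "pretty good"},
    {"name": "Ben", "hours_sleep": "seven", "mood": "Tired"},
    {"name": "Cy", "hours_sleep": "6.5", "mood": "GOOD!"},
    {"name": "Di", "hours_sleep": "80", "mood": "so tired"},
]

def parse_hours(value):
    """Return the value as a float, or None if it is not a plausible number of hours."""
    try:
        hours = float(value)
    except ValueError:
        return None
    return hours if 0 <= hours <= 24 else None

def categorize_mood(text):
    """Map free text onto a small set of categories (a subjective choice)."""
    lowered = text.lower()
    if "tired" in lowered:
        return "tired"
    if "good" in lowered:
        return "good"
    return "other"

cleaned = []
for row in raw_rows:
    hours = parse_hours(row["hours_sleep"])
    if hours is None:
        continue  # delete rows with invalid values rather than guessing
    cleaned.append({"name": row["name"], "hours_sleep": hours,
                    "mood": categorize_mood(row["mood"])})

# Sort by hours of sleep to get familiar with the range of values.
cleaned.sort(key=lambda r: r["hours_sleep"])
for row in cleaned:
    print(row)
```

Whether "seven" should be deleted or hand-corrected to 7, and which category "pretty good" belongs in, are judgment calls. That is the subjective side of data analysis the lesson refers to.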


Lesson 14: Creating Summary Tables

External Tools | Artifact Creation | Analyzing

In this lesson students learn how to create their own summary tables from raw data. A summary table typically represents one or more aggregations (groupings of items) and computations that are performed on the raw dataset. In most spreadsheet programs, a summary table is called a pivot table. In the lesson, students learn how to make pivot tables in Google Sheets using a provided dataset. Then students turn to the data they’ve collected as a class and, with their partner, use pivot tables to investigate it further.
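
What a pivot table computes can be sketched in plain Python: group the rows by one column, then run a computation on each group. The dataset and the `summarize` function below are hypothetical and stand in for what Google Sheets does through its pivot table tool.

```python
from collections import defaultdict

# Hypothetical raw data: one row per survey response.
rows = [
    {"grade": 9, "hours_sleep": 8.0},
    {"grade": 9, "hours_sleep": 6.0},
    {"grade": 10, "hours_sleep": 7.0},
    {"grade": 10, "hours_sleep": 7.5},
    {"grade": 10, "hours_sleep": 6.5},
]

def summarize(rows, group_key, value_key):
    """Group rows by group_key, then count and average value_key per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])   # aggregation step
    return {key: {"count": len(values),
                  "average": sum(values) / len(values)}  # computation step
            for key, values in sorted(groups.items())}

summary = summarize(rows, "grade", "hours_sleep")
for grade, stats in summary.items():
    print(grade, stats)
```

Five raw rows collapse into two summary rows, one per grade, which is the kind of reduction that makes patterns easier to spot.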


Lesson 15: Practice PT - Tell a Data Story

Practice PT | External Tools | Artifact Creation

For this Practice PT students will analyze the data that they have been collecting as a class in order to demonstrate their ability to discover, visualize, and present a trend or pattern they find in the data. Leading up to this lesson, students will have been working in pairs to clean and summarize their data. Students should complete this project individually but can get feedback on their ideas from their data-cleaning partner.

Student Links: Rubric

Chapter Commentary

The lessons in this chapter often have two things going on at once. In the background, the class collects data about themselves every day (the “class data tracker project”) so that there is data to process later on.

In the interim, students are learning about and developing skills with spreadsheet and visualization tools. The teacher should connect the skills students are learning in the exercises to potential things they might do with the class data. The pedagogical “insight” behind the data tracker project is that because the students themselves are the subject of the data, and because they collected it themselves, they will have some natural intuitions about interesting avenues for investigation. We want to build toward the enduring understanding that (3.2) Computing facilitates exploration and the discovery of connections in information.

The tasks students perform in these lessons are done from the computer scientist’s perspective, looking at such things as making sure that data types match the ways we anticipate computing on them (don’t collect text when you need a number), cleaning data after it is inevitably “dirtied,” and then performing some aggregations and visualizations to look for patterns. Along the way we need to understand how human bias can be introduced at each step so that we can accurately convey what any patterns in the data are or are not telling us. These activities help build toward the enduring understanding that (3.3) There are trade-offs when representing information as digital data.