In this chapter, we’ll review some basic concepts in both arithmetic and algebra that you likely already know but perhaps have not used in some time.
“But I’m not a math person”, you might say. That’s okay, you don’t need to be. We’re only looking at some very simple concepts. It’s all very chill.
In order to prepare you for the “applied” portion of this workshop, we’ll briefly introduce spreadsheet software like Microsoft Excel and Google Sheets, as well as define “tabular” data, or data made up of rows and columns.
Instructions: Observe the following equation. What does it equal?
5 + 5 * 2
If you answered “15”, you’re exactly right.
If you answered “20”, that’s okay, since I expected that you might. But why?
The Order of Operations describes the order in which we perform mathematical operations, like addition (+), subtraction (-), multiplication (*), division (/), parentheses, and exponents (“exponentiation”).
If we follow the Order of Operations, we know that multiplication (*) is calculated first (5 * 2 = 10), then addition (10 + 5 = 15).
Please excuse her (Aunt Sally). To this day, I’ve not the foggiest idea why Aunt Sally was so unusual. Regardless, she’s wonderful for remembering the Order of Operations. That is, PEMDAS, or Please Excuse My Dear Aunt Sally, helps us remember in what order we perform mathematical operations.
P for Parentheses, or ( and ).
E for Exponents, or 2^2
M for Multiplication, or *
D for Division, or /
A for Addition, or +
S for Subtraction, or -
Whatever helps you remember (Source: XKCD).
Observe the following expressions. Let’s explore why they equal what they equal.
2 * (10 + 2)
Answer: “24”.
Per PEMDAS, we calculate the expression in parentheses (P) first ((10 + 2) = 12).
Then, we multiply (M) the result by “2” (12 * 2 = 24).
24 - 20 / 4
Answer: “19”.
Per PEMDAS, we divide (D) first (20 / 4 = 5).
Then, we subtract (S) second (24 - 5 = 19).
Not too difficult, yeah?
Instructions: Use the Order of Operations to solve the following calculations.
6 + 3 - 1
10 - 4 / 2
4 * 2 / 4
15 / (2 + 1)
10 * 10 / 25
Challenge Questions: Calculate the following if you’d care for a challenge.
Note: The ^ operator indicates exponentiation, meaning “to the power of”.
25 * 2 / 10 + (5 * 2)
(10 + 15 + (5 * 5)) / 10
3^2 * 2 / (1 + 2)
Instructions: Observe the following. What is the answer?
Scenario: You spent $150 on Chick-fil-A last month. Each meal costs $10.
Question: How many times did you eat at Chick-fil-A?
If you answered “10”, that’s exactly right.
Reformulated, this might be: $150 / x = $10. This is algebra!
Future you.
Source: The Hangover (2009).
Algebra comes from the Arabic al-jabr, or “to make whole”.
When you find the missing piece in a calculation, you “make [it] whole”.
It’s especially common in applied analytics and data tables.
The following formula is used to determine the perfect size when buying a new television.
(Unless you’re like me and sit as close as possible without kicking your sub woofer).
TV = D / 2.5
Here, TV and D are variables that represent the following:
TV: The size of your new TV, in inches
D: The distance you’ll sit from your TV, in inches
Yes, this is a thing.
Scenario: Suppose you’ve purchased an incredibly nice TV.
It’s so nice that you’ll decide your next apartment based on its ability to accommodate your TV.
Let’s use algebra to figure out the ideal apartment space.
| TV Size (In.) | Space v. TV Size (In.) | TV * 2.5 In. | Living Room Space (In.) |
|---|---|---|---|
| 70 | 2.5 | 175 | 120 |
| 70 | 2.5 | 175 | 180 |
| 70 | 2.5 | 175 | 260 |
| 70 | 2.5 | 175 | 240 |
| 70 | 2.5 | 175 | 320 |
Conclusion: With a little bit of algebra, we can find a near-ideal apartment to fit our ridiculous TV.
Instructions: Solve for x in the following equations.
25 + x = 100
32 / (4 + 4) / x = 1
10 * (x - 3) = 20
2 - x = -8
(12 * 3) + x = 40
Challenges: Use the formulas provided below to calculate the correct answer.
BMI = (703 * W) / (H * H)
W represents weight, in pounds
H represents height, in inches
What is the BMI of a 160 lb. individual who is 5’10" tall?
T = C * R
C represents cost, in dollars
R represents rate of tipping, in percent
What is a 15% tip for a meal costing $43.12?
What is an 18% tip for a meal costing $16.49?
D = 1,100 * T
D represents distance, in feet
T represents time between flash and thunder, in seconds
How far away is a lightning strike if you hear thunder after 12 seconds?
A data table is a common format for storing data, and it is comprised of rows, columns, and values.
“But I use a table to eat breakfast and gamble”, you might say, “why is it called a table?”
“Table” comes from the Latin tabula, meaning a “plank” or “tablet” - both for all manner of table-related things, like eating breakfast, as well as for recording information.
“Tables” aren’t just for eating and gambling - like tablets, they preserve information.
Source: Business Insider
Rows comprise horizontal sequences of data, that is, from left to right. You can think of rows in the same way as rows in a movie theater, e.g. “front-row seats”.
Rows typically represent an individual in a group. For example, a class roster might have a row for each student, while a shopping list might have a row for each item to buy.
“Rows” in a movie theater. They’re horizontal.
Source: Inns of Monterey
Other names for rows include observations, records, instances, subjects, and trials.
Columns comprise vertical sequences of data, that is, from top to bottom. You can think of columns as physical columns in Greek architecture, as seen in the White House in Washington, D.C. On a “Bingo” card, “B”, “I”, “N”, “G”, and “O” each represent a column.
Columns typically represent some quality or attribute that may be different from individual to individual. For example, a calendar’s columns each represent a day of the week, while each row represents a different week in a month.
“Columns” as architecture. They’re vertical
Source: Wikipedia
Other names for columns include variables, fields, and features.
Since rows and columns intersect, each intersection contains a value. For example, if each row in a table represents a hospital patient, and one column contains the weight of each patient, the value might be “137 lbs”.
In a spreadsheet, this intersection is called a cell.
Some values may not be available in a table. These are called missing values. Like a calendar, where each row is a week and each column is a weekday, the value is the day of the month.
Each piece on a chess board represents a value. In column B, row 8, the value is “knight”.
Source: Chess-Space.com
Rows and columns, combined, create a data table, or tabular data.
As Craig Mack once spit in Flava in Ya Ear:
I got the data to turn your body into anti-matter.
Spreadsheet programs like Microsoft Excel and Google Sheets are designed specifically for tabular data.
Tabular data.
Remember all of the fun times you had playing Battleship as a kid? Me neither.
But it’s probably still more fun than math, so let’s try it out.
Milton Bradley’s less prejudicial, more progressive box cover.
Source: HiLoBrow
Instructions: Observe the following board layout (from Wikipedia) and select the correct choice.
Note: The grey cells indicate the location of a battleship.
F6J3A8I9D2J10C1:E1F6:F7C10:F10Challenge: Provide the range of cells that sinks the largest battleship.
Now that you better understand rows, columns, values, and data tables, you can look at how data are stored in spreadsheets in tabular format.
Spreadsheets are computer programs designed to help organize, manage, and change information, usually in the form of numbers and text. There are various programs for using spreadsheets, but the most common two are Microsoft Excel and Google Sheets. Here, you’ll learn about the different parts that make up a spreadsheet.
Note: Spreadsheets can actually lie.
Source: Dilbert
In both Microsoft Excel and Google Sheets:
A, B, C, and so on1, 2, 3, and so forthA Microsoft Excel table of unique words by hip hop artist.
Source: The Pudding
The same table, albeit in Google Sheets.
In both Excel and Sheets, you specify a cell by combining the column and row labels. For example:
The cell in the first column (A) and first row (1) is A1
The cell in the third column (C) and second row (2) is C2
When you click on a cell, its value appears in the above formula bar.
The value in a selected cell appears in the top “formula bar”.
Always make sure you point to the right cells.
Source: Dilbert
A range is the selection of more than one cell. The starting and ending cells are combined with a colon, :. For example:
The first 3 cells in row 1 are labeled A1:C1
The first 3 cells in row 2 are labeled A2:C2
The first 5 cells in column A are labeled A1:A5
The first 3 cells in rows 1 to 3 and columns A to E are labeled A1:E3
You can click and drag to select a range of cells.
A range of cells selected at the same time - here, A5:B11.
In spreadsheets, formulas are used to transform values in one or more cells, and transformed values will then appear in new cells. All formulas:
Are entered in the upper formula bar
Begin with an “equals” sign, or =
Output changed values in a newly selected cell or cells
Act on hard values, e.g. 102, or flexible values, e.g. C15
Observe the following. Here, we take the value in cell B2 and add 1,000 in the formula bar. That is:
Note that the selected cell where the new value will appear is also selected: C2.
A basic addition formula, adding 1,000 from B1 and placing the new value in C1.
By pressing “Enter” or “Return”, the new value now appears in cell C2.
When C2 is selected, the formula bar now shows the formula, not the value.
Our new value appears in the selected cell as a result of the formula.
We can use the same formula, over and over, on a range of cells.
Excel and Sheets cells, when selected, have a handle in the bottom-right corner.
These handles looks like little squares.
Note the little green square in the selected cell - this is the “handle”.
By clicking and dragging the formula across cells, we apply the formula to all respective cells.
Click and drag the “handle” to apply a formula to a range of cells.
In Excel and Sheets, as well as programming languages, a function is a specific operation that transforms one or more values in a formula.
There are many different functions for many different tasks
Formulas with functions begin with an “equals” sign, or =
Function names are followed by parentheses, or ()
Inside the parentheses, (), you specify which cell or cells you want to change, e.g. A1:C3 or D2:D14.
You can also modify a function inside the parentheses - these are called arguments.
Some spreadsheet functions are actually quite powerful.
Source: XKCD
Let’s look at the example function, SUM(), which adds two or more values together.
Inside the parentheses, our arguments are simply the values or cell names we want to add.
Question: What value will the following formula produce?
If you answered “10”, that’s exactly right.
This is because the formula = SUM(2, 3, 5) is the same as = 2 + 3 + 5.
Of course, 2 + 3 + 5 = 10.
Similarly, we can specify the names of the cells in our formula, like A2 or C9.
We can also insert one or more ranges, like C2:C15.
Instead of manually typing the names of cells, you can just click or click and drag them when inside the parentheses, ().
Functions are used within formulas and operate on both “hard” numbers and cells.
Instructions: So far, we’ve been using the first 15 hip hop artists from the text analysis by The Pudding, “The Largest Vocabulary in Hip Hop” (Daniels, 2019).
Download the full Microsoft Excel workbook with unique hip hop lyrics by clicking here.
Alternatively, open and copy the Google Sheets spreadsheet by clicking here.
Note: The unique word count of each hip hop artist is based on the first 35,000 words out of all of their lyrics.
In column C, create a new formula that divides column B by 35,000. It should look like this:
This will calculate the proportion (or percentage) of unique lyrics for a given artist.
Then, apply the formula to all artists by clicking and dragging the cell handle.
What proportion of Rakim’s lyrics are unique?
What proportion of Hopsin’s lyrics are unique?
What is the proportional difference between Immortal Technique’s and Aesop Rock’s vocabulary?
Challenge: After finding the proportions of each artist’s unique lyrics, use function AVERAGE() to find the average proportion of lyrics among all artists.
By how much does Ludacris’ proportion of unique lyrics differ from the average?
By how much does Logic’s proportion of unique lyrics differ from the average?