Chapter 0 Preliminaries and the Coverage

Do the following problems to see if you have met the basic mathematics requirements.

  1. Solve the equation \(3x-7=26\).

  2. Solve the equation \(\frac{12}{25}=\frac{x}{15}\).

  3. Solve the equation \(\frac{18}{27}=\frac{14}{x}\).

  4. Solve the equation \(\frac{17+1}{27+1}=\frac{8+1}{x+1}\).

  5. Solve the equation \(256 = 64(1+50r)\).

  6. Solve the equation \(145 =M\cdot \frac{(1+0.2)^{12}-1}{0.2}\).

  7. Solve the equation \(y=13(y-24)\).

  8. Evaluate \(3a-4(52-4b)\) for \(a = -5\) and \(b=-2\).

  9. Let the set \(A\) be \(\{3, 1, 12\}\). Write down all subsets of it. Use \(\{ ~~\}\) to denote the empty set.

  10. Evaluate \(\frac{(11-8)^2+(8-9)^2+(0-8)^2+(6-8)^2+ (15-8)^2}{5-1}\).

  11. Reduce \(\frac{0.15}{0.75}\) and \(\frac{0.46}{0.69}\) by hand.

  12. An expression such as \(25\cdot (0.2)+ 34\cdot (0.3)+42\cdot (0.5)\) is called the weighted average of the three numbers 25, 34, and 42, with weights 0.2, 0.3, and 0.5, respectively. Now evaluate this weighted average: \(15\cdot (0.1)+ 12\cdot (0.3)+ 20\cdot (0.3)+25\cdot (0.1)+8\cdot (0.2)\). Keep in mind that the weights in a weighted average must sum to 1.

  13. Evaluate \((-15)(\frac{1}{8})+(5)(\frac{2}{8})+(2)(\frac{2}{8})+(4)(\frac{3}{8})\).

This course will introduce students to various applications of mathematics to real-life problems.

Solutions to the 13 problems above.

  1. Solve the equation \(3x-7=26\).

    Adding 7 to both sides yields \(3x=33\). Dividing both sides by 3 gives \(x = 11\).

  2. Solve the equation \(\frac{12}{25}=\frac{x}{15}\).

    Multiplying both sides by 15 gives \(\frac{12}{25}\cdot 15=x\), or \(\frac{12}{25}\cdot \frac{15}{1}=x\). Canceling 5 gives \(\frac{12}{5}\cdot \frac{3}{1}=x\), or \(x=\frac{36}{5}=7\frac{1}{5}=7.2\).

  3. Solve the equation \(\frac{18}{27}=\frac{14}{x}\). Canceling 9 on the left side of the equation gives \(\frac{2}{3}=\frac{14}{x}\). Cross-multiplying gives \(2x=42\), or \(x=21\).

  4. Solve the equation \(\frac{17+1}{27+1}=\frac{8+1}{x+1}\). Simplifying both sides produces \(\frac{18}{28}=\frac{9}{x+1}\). Canceling 2 on the left side yields \(\frac{9}{14}=\frac{9}{x+1}\). Since the numerators are equal, the denominators must be equal, so \(14=x+1\), or \(x=13\).

  5. Solve the equation \(256 = 64(1+50r)\). In order to isolate \(r\), we divide both sides by 64. This gives \(4 = 1+50r\). Subtracting 1 from both sides gives \(3=50r\). Dividing by 50 gives \(r=\frac{3}{50}=0.06\).

  6. Solve the equation \(145 =M\cdot \frac{(1+0.2)^{12}-1}{0.2}\).

    Simplifying the right side of the equation gives \(145=M\cdot 39.58050224\) (keep as many decimal places as possible). Solving for \(M\) gives \(M=145/39.58050224\approx 3.66\) (round the final answer to two decimal places unless otherwise specified).

  7. Solve the equation \(y=13(y-24)\).

    Since both sides contain the unknown \(y\), we need to remove the parentheses using the distributive law. This gives \(y = 13y-13\cdot 24\). Subtracting \(13y\) from both sides gives \(-12y=-13\cdot 24\). Dividing both sides by \(-12\) gives \(y=13\cdot 2=26\).

  8. Evaluate \(3a-4(52-4b)\) for \(a = -5\) and \(b=-2\).

    By the distributive law, \(3a-4(52-4b)=3a-208+16b\). Substituting \(a=-5\) and \(b=-2\) gives \(3(-5)-208+16(-2)=-15-208-32=-255\).

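  9. Let the set \(A\) be \(\{3, 1, 12\}\). Write down all subsets of it. Use \(\{ ~~\}\) to denote the empty set.

    A set with 3 elements has \(2^3=8\) subsets: \(\{ ~~\}\), \(\{3\}\), \(\{1\}\), \(\{12\}\), \(\{3, 1\}\), \(\{3, 12\}\), \(\{1, 12\}\), and \(\{3, 1, 12\}\).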

  10. Evaluate \(\frac{(11-8)^2+(8-9)^2+(0-8)^2+(6-8)^2+ (15-8)^2}{5-1}\).
$$\frac{(11-8)^2+(8-9)^2+(0-8)^2+(6-8)^2+ (15-8)^2}{5-1}=\frac{9+1+64+4+49}{4}=\frac{127}{4}=31.75$$
  11. Reduce \(\frac{0.15}{0.75}\) and \(\frac{0.46}{0.69}\) by hand.
$$\frac{0.15}{0.75}=\frac{15}{75}\ \text{(multiplying top and bottom by 100)}=\frac{1}{5}$$

$$\frac{0.46}{0.69}=\frac{46}{69}=\frac{2}{3}\ \text{(dividing top and bottom by 23)}$$
  12. An expression such as \(25\cdot (0.2)+ 34\cdot (0.3)+42\cdot (0.5)\) is called the weighted average of the three numbers 25, 34, and 42, with weights 0.2, 0.3, and 0.5, respectively. Now evaluate this weighted average: \(15\cdot (0.1)+ 12\cdot (0.3)+ 20\cdot (0.3)+25\cdot (0.1)+8\cdot (0.2)\). Keep in mind that the weights in a weighted average must sum to 1.
$$1.5+3.6+6+2.5+1.6=15.2$$
  13. Evaluate \((-15)(\frac{1}{8})+(5)(\frac{2}{8})+(2)(\frac{2}{8})+(4)(\frac{3}{8})\). \[(-15)(\frac{1}{8})+(5)(\frac{2}{8})+(2)(\frac{2}{8})+(4)(\frac{3}{8})=\frac{-15}{8}+\frac{10}{8}+\frac{4}{8}+\frac{12}{8}=\frac{11}{8}\]
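The weighted averages in Problems 12 and 13 can also be checked by machine. The following is a minimal Python sketch, not part of the original problem set; the function name `weighted_average` is made up for illustration, and exact fractions are used so that the check "the weights sum to 1" is not disturbed by rounding.

```python
from fractions import Fraction

def weighted_average(values, weights):
    """Return the sum of value * weight, after checking that the weights sum to 1."""
    if sum(weights) != 1:
        raise ValueError("weights must sum to 1")
    return sum(v * w for v, w in zip(values, weights))

# Problem 12: weights 0.1, 0.3, 0.3, 0.1, 0.2 written as exact fractions.
p12 = weighted_average(
    [15, 12, 20, 25, 8],
    [Fraction(1, 10), Fraction(3, 10), Fraction(3, 10), Fraction(1, 10), Fraction(2, 10)],
)

# Problem 13: weights 1/8, 2/8, 2/8, 3/8.
p13 = weighted_average(
    [-15, 5, 2, 4],
    [Fraction(1, 8), Fraction(2, 8), Fraction(2, 8), Fraction(3, 8)],
)

print(p12, float(p12))  # 76/5 15.2
print(p13)              # 11/8
```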

Chapter 1. The Mathematics of Elections

The Basic Elements of an Election

  • The candidates: people or other things

  • The voters: people who get a say in the outcome of an election

  • The ballots: single-choice ballot, preference ballot, and truncated preference ballot (book page 4)

  • The outcome: 3 types (winner-only, partial ranking, full ranking)

  • The voting method: the method used to tabulate the ballots and produce the outcome

In an election, ties can occur (book page 9). Different tie-breaking procedures may be adopted in different situations. We will not go into any detail of this.

We focus on preference ballots. Here is a set of 37 collected preference ballots: collectedBallots

When you have preference ballots, you need to create a preference schedule (book page 8), a table in which rows are choices and columns are numbers of different orderings of candidates. Here is a preference schedule:

preferenceSchedule

The following example will help you create such a table.

Example of a preference ballot

Watch this video: preference ballot (right click and choose “Open link in new tab”) and this: https://www.youtube.com/watch?v=4_6sUpQtwd0

Questions:

  1. How many votes were cast in this election?

  2. The smallest number for a majority is ____ votes.

  3. Which candidate(s) had the fewest last-choice votes?

The answers are 37, 19 (since half of 37 is 18.5, and the smallest whole number above it is 19), and B and C.

Example of results from a preference ballot.

Refer to the following format for a preference ballot: The names of the candidates are printed on the ballot in some random order, and the voter is simply asked to rank the candidates [for example, see Fig. 1.1(c) of textbook]. For ease of reference we call this the “printed-names” format. (This format makes it easier on the voters and is useful when the names are long or when a misspelled name invalidates the ballot. The main disadvantage is that it tends to favor the candidates who are listed first.)

An election is held using the “printed-names” format for the preference ballots. The table below shows the results of the election. Rewrite it in the conventional preference schedule format used in the text. (Use \(A, B, C, D\), and \(E\) as shorthand for the names of the candidates.)

Solution.

The above table shows that

  • 37 voters place \(B\) as the first, \(E\) as the second, \(A\) as the 3rd, \(C\) as the fourth, and \(D\) as the fifth. This gives the second column in the following table.

  • 36 voters place \(A\) as the first, \(B\) as the second, \(D\) as the 3rd, \(C\) as the fourth, and \(E\) as the fifth. This gives the third column in the following table.

  • 24 voters place \(B\) as the first, \(A\) as the second, \(D\) as the 3rd, \(C\) as the fourth, and \(E\) as the fifth. This gives the fourth column in the following table.

  • 13 voters place \(E\) as the first, \(B\) as the second, \(C\) as the 3rd, \(A\) as the fourth, and \(D\) as the fifth. This gives the fifth column in the following table.

  • 5 voters place \(C\) as the first, \(E\) as the second, \(A\) as the 3rd, \(D\) as the fourth, and \(B\) as the fifth. This gives the sixth column in the following table.

Number of voters 37 36 24 13 5
1st \(B\) \(A\) \(B\) \(E\) \(C\)
2nd \(E\) \(B\) \(A\) \(B\) \(E\)
3rd \(A\) \(D\) \(D\) \(C\) \(A\)
4th \(C\) \(C\) \(C\) \(A\) \(D\)
5th \(D\) \(E\) \(E\) \(D\) \(B\)

Are you able to go backwards?
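The relationship between individual ballots and a preference schedule can be sketched in a few lines of Python. This is only an illustration: the helper names `to_ballots` and `to_schedule` are invented here, and the schedule is the 115-voter table above typed in as a dictionary.

```python
from collections import Counter

# Preference schedule from the table above: ranking (1st to 5th) -> number of voters.
schedule = {
    ("B", "E", "A", "C", "D"): 37,
    ("A", "B", "D", "C", "E"): 36,
    ("B", "A", "D", "C", "E"): 24,
    ("E", "B", "C", "A", "D"): 13,
    ("C", "E", "A", "D", "B"): 5,
}

def to_ballots(schedule):
    """Expand a schedule into one ranking per voter."""
    return [ranking for ranking, count in schedule.items() for _ in range(count)]

def to_schedule(ballots):
    """Count how many ballots show each distinct ranking."""
    return dict(Counter(ballots))

ballots = to_ballots(schedule)
print(len(ballots))                      # 115
print(to_schedule(ballots) == schedule)  # True
```

Tallying identical rankings with a counter is exactly what the columns of a preference schedule record, which is why the round trip returns the original table.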

The Plurality Method

When determining the outcome of an election, the simplest and the most commonly used method is the plurality method. A plurality winner is the candidate who gets the most first-place votes. This method only cares about the first-place votes.

Example of using the plurality method

For the previous example, we can ignore the 2nd, 3rd, and 4th rows, so we end up with 14 votes out of 37 for candidate A, 4 votes for candidate B, 10 + 1 or 11 votes for candidate C, and 8 votes for candidate D. That is, A gets 14/37 or 37.84% of all votes, B gets 10.81%, C gets 29.73%, and D gets 21.62%. According to the plurality method, A is the winner.
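The plurality tally is simple enough to reproduce in a short Python sketch. The first-place counts below are the ones quoted in the example; the variable names are illustrative only.

```python
# First-place votes quoted in the example (37 ballots in total).
first_place = {"A": 14, "B": 4, "C": 10 + 1, "D": 8}

total = sum(first_place.values())
majority = total // 2 + 1  # smallest whole number above half of the votes
winner = max(first_place, key=first_place.get)
shares = {c: round(100 * v / total, 2) for c, v in first_place.items()}

print(total, majority)                  # 37 19
print(winner)                           # A
print(shares)                           # {'A': 37.84, 'B': 10.81, 'C': 29.73, 'D': 21.62}
print(first_place[winner] >= majority)  # False
```

The last line shows that A wins with a plurality of the first-place votes without reaching the 19 votes needed for a majority.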

One drawback of the plurality method is that a candidate might become the winner with less than 50% of the votes (or without having the majority of the votes). That is, a winner may have a plurality without having a majority. Another drawback is that a candidate that is the best on a head-to-head basis may not be a plurality winner (book page 12). This is obviously unfair. A candidate preferred by a majority of the voters over every other candidate when the candidates are compared in head-to-head comparisons is called a Condorcet candidate. A Condorcet candidate may not exist in an election.

We will introduce a few fairness criteria later (textbook page 22).

Example of head-to-head comparison

Use the data in the video: (right click and choose “Open link in new tab”)

Does there exist a Condorcet candidate? Why or why not?

Solution.

Let’s make pairwise comparisons for the 5 candidates (F, H, O, R, and S). There are 10 comparisons in total. In general, when there are \(N\) candidates, there are \(N(N-1)/2\) pairwise comparisons.

  1. F vs H: 3 voters prefer F and (49+48) or 97 voters prefer H, so H beats F.

  2. F vs O: (49+3) or 52 voters prefer F and 48 voters prefer O, so F beats O. (The 3 voters who prefer F to H must also prefer F to O, because every voter prefers H to O.)

  3. F vs R: (48+3) or 51 voters prefer F and 49 voters prefer R, so F beats R.

  4. F vs S: (49+3) or 52 voters prefer F and 48 voters prefer S, so F beats S.

  5. H vs O: all voters prefer H and no voter prefers O, so H beats O.

  6. H vs R: (48+3) or 51 voters prefer H and 49 voters prefer R, so H beats R.

  7. H vs S: all voters prefer H and no voter prefers S, so H beats S.

  8. O vs R: (48+3) or 51 voters prefer O and 49 voters prefer R, so O beats R.

  9. O vs S: 49 voters prefer O and (48+3) or 51 voters prefer S, so S beats O.

  10. R vs S: 49 voters prefer R and (48+3) or 51 voters prefer S, so S beats R.

In summary, F wins 3 times, H wins 4 times, O wins once, R wins 0 times, and S wins twice. Only H beats all other candidates, so H is the Condorcet candidate.

The Borda Count Method

This is the second most commonly used method for determining the winner of an election. The method assigns points to each place on a ballot. Specifically, 1 point is assigned to the last place, 2 points to the second-from-last place, and so on. The points are tallied for each candidate separately, and the candidate with the highest total is the winner. Candidates can be ranked by the points they receive.

Example of Borda count (book page 13).

Watch the video: (right click and choose “Open link in new tab”)

Determine the winner using the Borda count method.

Solution.

When we tally the points,

A gets (4)(14) + (3)(0) + (2)(0) + (1)(23) = 56 + 0 + 0 + 23 = 79 points, since A gets 14 first-place votes, 0 second-place votes, 0 third-place votes, and (10+8+4+1) or 23 fourth-place votes.

B gets (4)(4) + (3)(24) + (2)(9) + (1)(0) = 16 + 72 + 18 + 0 = 106 points

C gets (4)(11) + (3)(8) + (2)(18) + (1)(0) = 44 + 24 + 36 + 0 = 104 points

D gets (4)(8) + (3)(5) + (2)(10) + (1)(14) = 32 + 15 + 20 + 14 = 81 points

By the Borda count method, B is the winner.

The results also give a ranking of the 4 candidates: B > C > D > A.

If the plurality method is used, the winner is A, as A receives 14 first-place votes, while B receives 4 first-place votes, C receives (10 + 1) or 11 first-place votes, and D receives 8 first-place votes. This method gives a ranking of the 4 candidates: A > C > D > B. A huge difference between the two methods!

The assignment of points to the different places may be customized. For example, instead of assigning 4 points to each first-place vote, 4.5 points may be assigned. Example 1.12 on page 14 of the textbook is worth reading. Note that the data there is not organized as a regular preference schedule, but as a 2-way table in which the rows are candidates, the columns are places, and the body of the table gives the number of votes each candidate received at each place. A standard preference schedule can always be converted to such a 2-way table.
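The Borda tally above can be reproduced from a 2-way table of place counts. The sketch below is only an illustration: the function name `borda_points` and its optional `points` argument are assumptions made here, while the counts are the ones used in the solution above.

```python
# Place counts quoted in the solution: votes for 1st, 2nd, 3rd, and 4th place.
place_counts = {
    "A": [14, 0, 0, 23],
    "B": [4, 24, 9, 0],
    "C": [11, 8, 18, 0],
    "D": [8, 5, 10, 14],
}

def borda_points(place_counts, points=None):
    """Total points per candidate; by default last place earns 1 point."""
    n = len(next(iter(place_counts.values())))
    if points is None:
        points = list(range(n, 0, -1))  # for 4 places: [4, 3, 2, 1]
    return {
        candidate: sum(p * k for p, k in zip(points, counts))
        for candidate, counts in place_counts.items()
    }

totals = borda_points(place_counts)
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals)   # {'A': 79, 'B': 106, 'C': 104, 'D': 81}
print(ranking)  # ['B', 'C', 'D', 'A']
```

A customized point scheme, such as 4.5 points for first place, can be tried by passing `points=[4.5, 3, 2, 1]`.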

The Plurality-with-Elimination Method

This section is skipped!

The Method of Pairwise Comparisons

When making a pairwise comparison between two candidates, we give 1 point to the winner and 0 points to the loser. If a comparison ends in a tie, each candidate is given 0.5 points. Candidates can be ranked based on the points they receive.

The method of pairwise comparisons is very much like a round-robin tournament: every player plays every other player once; the winner of each match gets 1 point and the loser gets no points (if there is a tie, each player gets 0.5 points).

The total number of comparisons equals \(N(N-1)/2\), where \(N\) is the number of candidates.

This video shows an example regarding pairwise comparison (right click and choose “Open link in new tab”).

This method always chooses the Condorcet candidate (when there is one) as the winner of the election.
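As a rough illustration of the scoring rule, the Python sketch below runs all pairwise comparisons on the 115-voter preference schedule from the printed-names example earlier in this chapter. The function names are invented for this sketch.

```python
from itertools import combinations

# The 115-voter preference schedule from the printed-names example.
schedule = {
    ("B", "E", "A", "C", "D"): 37,
    ("A", "B", "D", "C", "E"): 36,
    ("B", "A", "D", "C", "E"): 24,
    ("E", "B", "C", "A", "D"): 13,
    ("C", "E", "A", "D", "B"): 5,
}

def pairwise_points(schedule):
    """1 point per head-to-head win, 0.5 each for a tie, 0 for a loss."""
    candidates = sorted(next(iter(schedule)))
    voters = sum(schedule.values())
    points = {c: 0.0 for c in candidates}
    for x, y in combinations(candidates, 2):
        # Voters who rank x above y; everyone else ranks y above x.
        x_votes = sum(n for r, n in schedule.items() if r.index(x) < r.index(y))
        y_votes = voters - x_votes
        if x_votes > y_votes:
            points[x] += 1
        elif y_votes > x_votes:
            points[y] += 1
        else:
            points[x] += 0.5
            points[y] += 0.5
    return points

def condorcet_candidate(schedule):
    """Return the candidate who wins every comparison, or None if there is none."""
    points = pairwise_points(schedule)
    for candidate, p in points.items():
        if p == len(points) - 1:
            return candidate
    return None

points = pairwise_points(schedule)
print(points)                         # {'A': 3.0, 'B': 4.0, 'C': 1.0, 'D': 2.0, 'E': 0.0}
print(condorcet_candidate(schedule))  # B
```

With 5 candidates there are \(5\cdot 4/2=10\) comparisons, and the points add up to 10. Candidate B wins all four of its comparisons, so B is both the Condorcet candidate and the winner under this method.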

Fairness Criteria

The American economist Kenneth Arrow showed that all voting methods have limitations. This is known as Arrow’s Impossibility Theorem. Arrow set forth a minimum set of requirements for a voting method to be fair:

  • The majority criterion. If there is a majority candidate (i.e., a candidate with a majority of the first-place votes), then that candidate should be the winner of the election.

  • The Condorcet criterion. If there is a Condorcet candidate (i.e., a candidate who beats every other candidate in a pairwise comparison), then that candidate should be the winner of the election.

  • The monotonicity criterion. If candidate X is the winner of an election, then X should still be the winner had a voter ranked X higher on their preference ballot.

  • The independence-of-irrelevant-alternatives (IIA) criterion. If candidate X is the winner of an election, then X should still be the winner had one or more of the irrelevant alternatives (i.e., losing candidates) not been in the race. (In other words, the winner should not be hurt by the elimination from the election of losing candidates.)

The above fairness criteria can be used as a benchmark by which we can measure any voting method.

This video shows an example regarding fairness criteria (right click and choose “Open link in new tab”).

Suggested Exercises in the Textbook

1, 3, 11, 21, 23, 41, 51

Chapter 2. The Mathematics of Power

Chapter one considered elections in which one voter has only one vote (one voter-one vote). In this chapter, we consider elections in which one voter has \(x\) votes, with \(x > 1\) (one voter-\(x\) votes). That is, in some elections, voters are given different amounts of influence or power in the outcome of the voting. Such voting is called weighted voting. Weighted voting is seen in

While Chapter one focused on elections involving three or more choices, this chapter only considers voting on two choices which can be recast as a yes-no vote, known as a motion.

An Introduction to Weighted Voting

Every weighted voting system is characterized by three elements:

  • The players. Persons or institutions (such as corporations, municipalities, districts, states, and countries)

  • The weights. The weight of a player is the number of votes the player controls.

  • The quota. It is the minimum number of votes required to pass a motion. It could be a simple majority of the votes or even unanimity (100% of the votes). To avoid anarchy or gridlock, the quota is required to be greater than half of the total number of votes and no greater than the total number of votes. In the U.S. Senate (each state is equally represented by two seats), for example, it takes a simple majority to pass an ordinary law (in the event of a tie, the vice president casts the deciding vote), but it takes a minimum of 60 votes to stop a filibuster, and it takes a minimum of two-thirds of the votes to override a presidential veto.

Some notation to be used in this chapter:

  • \(N\) = the number of players

  • \(P_1, P_2, \cdots, P_N\) represent the \(N\) players

  • \(w_1, w_2, \cdots, w_N\) represent the weights of the \(N\) players, listed in the order \(w_1\ge w_2\ge \cdots\ge w_N\) (players may have equal weights)

  • \(V\) = the total number of votes, with \(V = w_1+w_2+\cdots+w_N\)

  • \(q\) = the quota

  • \([q: w_1, w_2, \cdots, w_N]\) represents a generic weighted voting system with \(N\) players

The following are a few examples.

Example 1. (Venture Capitalism) Four partners (\(P_1, P_2, P_3, \text{and} ~P_4\)) decide to start a new business venture. In order to raise the $200,000 venture capital needed for startup money, they issue 20 shares worth $10,000 each. Suppose that \(P_1\) buys 8 shares, \(P_2\) buys 7 shares, \(P_3\) buys 3 shares, and \(P_4\) buys 2 shares, with the usual agreement that one share equals one vote in the partnership. Suppose also that the bylaws say that two-thirds of the partnership votes are required to pass a motion.

  1. How many total votes are there?

  2. What is the weight of each partner?

  3. What is the quota?

  4. Describe the weighted voting system using notation.

Solution.

  1. \(V = 20\), which is given directly.

  2. \(w_1 = 8, w_2 = 7, w_3 = 3, w_4 = 2\)

  3. \(q = (20)\cdot \frac{2}{3}\approx 13.33\), which is (always) rounded up to 14.

  4. The weighted voting system can be symbolized as [\(14: 8,7,3,2\)].

Example 2. (Various weighted voting systems) What is wrong with each of the following weighted voting systems?

  1. [\(19: 8, 7, 3, 2\)]

  2. [\(30: 10, 10, 10, 9\)]

  3. [\(11: 12, 5, 4\)]

  4. [\(12: 9, 5, 4, 2\)]

Solution.

  1. To pass a motion, it requires every player to show its support (for example, without support from P4, the total number of votes is at most 18); that is, all players are in fact equally important.

  2. Player 4 is completely marginalized, since the other three players determine the final outcome and its vote will never make a difference. Thus, a player may have many votes, but have little power.

  3. Whether the motion passes or not is fully determined by the first player. Player 1 is called a dictator. A player is a dictator if and only if the player’s weight is no smaller than the quota.

  4. Without the approval of player 1, a motion can never be passed. Player 1 has veto power, but is not a dictator. Such a player plays the role of a “spoiler”: it cannot force a motion to pass but can force a motion to fail. A player with weight \(w\) has veto power if and only if \(w < q\) and \(V-w < q\). Does the system [\(18: 12, 8, 5\)] have any player that has veto power?
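The quota, dictator, and veto-power conditions stated in Examples 1 and 2 translate directly into code. The following Python sketch is one possible way to write them; the function names are made up for illustration, and players are numbered from 1 to match \(P_1, P_2, \cdots\).

```python
from fractions import Fraction
from math import ceil

def quota_from_fraction(total_votes, fraction):
    """Smallest whole number of votes that is at least the given fraction of the total."""
    return ceil(total_votes * fraction)

def dictators(quota, weights):
    """Players whose weight alone is no smaller than the quota."""
    return [i for i, w in enumerate(weights, start=1) if w >= quota]

def veto_players(quota, weights):
    """Players with w < q and V - w < q, following the condition in the text."""
    v = sum(weights)
    return [i for i, w in enumerate(weights, start=1) if w < quota and v - w < quota]

print(quota_from_fraction(20, Fraction(2, 3)))  # 14
print(dictators(11, [12, 5, 4]))                # [1]
print(veto_players(12, [9, 5, 4, 2]))           # [1]
print(veto_players(19, [8, 7, 3, 2]))           # [1, 2, 3, 4]
```

The last line agrees with the observation that [19: 8, 7, 3, 2] needs the support of every player. The same function can be used to check your answer for [18: 12, 8, 5].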

This video (right click and choose “Open link in new tab”) provides a nice introduction to weighted voting. After watching it, try problem 3 on page 60 of the textbook.

Banzhaf Power

There are two commonly used methods for characterizing the power (influence on the voting outcome) of each player. We first introduce the Banzhaf Power.

We need the following concepts:

  • Coalition. A coalition is any group of players voting the same way. For example, a coalition consisting of players \(P_2\), \(P_4\), and \(P_5\) can be written as the set {\(P_2, P_4, P_5\)}. The coalition consisting of all players is called the grand coalition. Thus, a coalition can be treated as a non-empty set of players. The total number of coalitions (non-empty sets) equals \(2^N-1\), where \(N\) represents the total number of players. (Refer to the solution to problem 9 in Chapter 0.)

  • Winning coalition. A coalition is a winning coalition if it has enough weight to meet the quota.

  • Critical player. A player is critical in a winning coalition if the coalition would become a losing coalition without that player.

  • Critical count. The critical count of a player is the number of winning coalitions in which the player is critical. The sum of the critical counts of all players is called the total critical count.

  • Banzhaf power index. The ratio of a player’s critical count to the total critical count is called the power index of the player. The collection of the power indices of all players is called the Banzhaf power distribution.

Here is a nice video to help you learn the basic concepts: Video (right click and choose “Open link in new tab”).

Example. Consider the voting system [16: 7, 6, 3, 3, 2].

  1. How many players are there?

  2. What is the total number of votes?

  3. What is the weight of each player?

  4. Are any players dictators?

  5. Does any player have veto power?

  6. Are any dummies? A dummy is a player who has no power at all.

  7. List all possible coalitions.

  8. List all possible winning coalitions.

  9. Find the critical count for each player.

  10. Find the Banzhaf power distribution.

Solution.

  1. There are 5 players.

  2. There are \(7+6+3+3+2\) or 21 votes.

  3. The weights of the 5 players are 7, 6, 3, 3, and 2, respectively.

  4. No player has weight at least 16, so there is no dictator.

  5. Players 1 and 2 are both veto players, since without each of the two the remaining players cannot meet the quota.

  6. Player 5 is the only dummy, since with or without it the outcome will not change.

  7. There are \(2^5-1\) or 31 coalitions.

    • Five 1-player coalitions: {\(P_1\)}, {\(P_2\)}, {\(P_3\)}, {\(P_4\)}, {\(P_5\)}
    • Ten 2-player coalitions: {\(P_1, P_2\)}, {\(P_1, P_3\)}, {\(P_1, P_4\)}, {\(P_1, P_5\)}, {\(P_2, P_3\)}, {\(P_2, P_4\)},{\(P_2, P_5\)}, {\(P_3, P_4\)}, {\(P_3, P_5\)}, {\(P_4, P_5\)}
    • Ten 3-player coalitions: {\(P_1, P_2, P_3\)}, {\(P_1, P_2, P_4\)}, {\(P_1, P_2, P_5\)}, {\(P_1, P_3, P_4\)}, {\(P_1, P_3, P_5\)}, {\(P_1, P_4, P_5\)}, {\(P_2, P_3, P_4\)}, {\(P_2, P_3, P_5\)}, {\(P_2, P_4, P_5\)}, {\(P_3, P_4, P_5\)}
    • Five 4-player coalitions: {\(P_1, P_2, P_3, P_4\)}, {\(P_1, P_2, P_3, P_5\)}, {\(P_1, P_2, P_4, P_5\)}, {\(P_1, P_3, P_4, P_5\)}, {\(P_2, P_3, P_4, P_5\)}
    • One 5-player coalition: {\(P_1, P_2, P_3, P_4, P_5\)}
  8. Here, a winning coalition must have at least 3 players to meet the quota of 16, and the first two players must be in every winning coalition. So, the winning coalitions are:

    • {\(P_1, P_2, P_3\)}, with \(P_1\), \(P_2\), and \(P_3\) all being critical players
    • {\(P_1, P_2, P_4\)}, with \(P_1\), \(P_2\), and \(P_4\) all being critical players
    • {\(P_1, P_2, P_3, P_4\)}, with \(P_1\) and \(P_2\) being critical players
    • {\(P_1, P_2, P_3, P_5\)}, with \(P_1\), \(P_2\), and \(P_3\) being critical players
    • {\(P_1, P_2, P_4, P_5\)}, with \(P_1\), \(P_2\), and \(P_4\) being critical players
    • {\(P_1, P_2, P_3, P_4, P_5\)}, with \(P_1\) and \(P_2\) being critical players

  9. Player 1 is critical in all 6 of the winning coalitions, so the critical count of player 1 is 6. Player 2 is critical in all of the 6 winning coalitions, so the critical count of player 2 is 6. Player 3 is critical in 2 of the 6 winning coalitions, so the critical count of player 3 is 2. Player 4 is critical in 2 of the 6 winning coalitions, so the critical count of player 4 is 2. Player 5 is critical in 0 of the 6 winning coalitions, so the critical count of player 5 is 0.

  10. The total critical count is \(6+6+2+2+0\) or 16, so the Banzhaf power distribution is

\[6/16=0.375=37.5\%, 6/16=0.375=37.5\%, 2/16 = 0.125=12.5\%, 2/16 = 0.125=12.5\%, 0/16 = 0=0\%\]
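The whole computation (list the coalitions, keep the winning ones, find the critical players, and divide by the total critical count) can be automated. The Python sketch below is one straightforward way to do it by brute force; the function name `banzhaf` is an assumption of this sketch, and the approach is only practical for a small number of players because it visits all \(2^N-1\) coalitions.

```python
from fractions import Fraction
from itertools import combinations

def banzhaf(quota, weights):
    """Return (critical counts, Banzhaf power distribution) for [quota: weights]."""
    n = len(weights)
    critical = [0] * n
    for size in range(1, n + 1):
        for coalition in combinations(range(n), size):
            total = sum(weights[i] for i in coalition)
            if total < quota:
                continue  # losing coalition: nobody is critical
            for i in coalition:
                if total - weights[i] < quota:  # the coalition loses without player i
                    critical[i] += 1
    total_critical = sum(critical)
    return critical, [Fraction(c, total_critical) for c in critical]

counts, distribution = banzhaf(16, [7, 6, 3, 3, 2])
print(counts)                            # [6, 6, 2, 2, 0]
print([float(p) for p in distribution])  # [0.375, 0.375, 0.125, 0.125, 0.0]
```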

Shapley-Shubik Power

The Shapley-Shubik Power is another method to characterize the power (influence on the voting outcome) of each player.

We need some understanding of subsets and permutations before we introduce the method.

Recall that a set is a collection of distinct and well-defined items (including numbers and creatures). Each item in a set is called an element.

The following are examples of sets:

  • {3, 1, 15, 28}, a set containing 4 elements

  • {34, 21, 8, 62, 91, 234, 88}, a set containing 7 elements

  • {H, T}, a set containing 2 elements

The following are not sets:

  • {33, 12, 29, 12, 40}, not a set since 12 is repeated thus not distinct

  • A group of good people, not a set since “good” is not well-defined.

A subset of a set is a collection of some (possibly none, possibly all) of the items in the set. For the set {1, 2, 3}, the subsets can be listed by size:

  • One 0-element subset: { }, the set that contains no elements.

  • Three 1-element subsets: {1}, {2}, {3}

  • Three 2-element subsets: {1, 2}, {1, 3}, {2, 3}

  • One 3-element subset: {1, 2, 3}

There are 1 + 3 + 3 + 1 = 8 subsets in total.

In general, if a set has \(N\) elements, then it has \(2^N\) subsets. For example, {2, 0, 6, 8} has \(2^4\) or 16 subsets while {1, 2, 3, 4, 5, 6} has \(2^6\) or 64 subsets.
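The size-by-size listing used above is easy to mechanize. The Python sketch below (the function name `all_subsets` is invented for illustration) lists the subsets of a small set and confirms the \(2^N\) count for the examples in the text.

```python
from itertools import combinations

def all_subsets(items):
    """List every subset of items, smallest first, starting with the empty set."""
    items = list(items)
    return [set(c) for k in range(len(items) + 1) for c in combinations(items, k)]

subsets = all_subsets([1, 2, 3])
print(subsets)
print(len(subsets))                                # 8
print(len(all_subsets([2, 0, 6, 8])))              # 16
print(len(all_subsets([1, 2, 3, 4, 5, 6])))        # 64
```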

When arranging distinct items and counting the number of possible arrangements, we need the concept of a permutation. Let’s consider the problem “In how many ways can we arrange 3 people (Alice, Bob, Colin) in order in a line?” The following are all possible ways:

  • ABC

  • ACB

  • BAC

  • BCA

  • CAB

  • CBA

There are 6 ways in total. When we arrange 10 people in a line, this enumeration method is not efficient. We need the so-called multiplicative counting principle:

If completing task one can be done in \(m\) ways and, for each way task one is done, completing task two can be done in \(n\) ways, then the two tasks can be done together in \(m\cdot n\) ways.

This principle can be extended to more than two different tasks.

Now, in how many ways can we arrange 10 people in a line? We can use the counting principle and think this way: Imagine there are 10 positions already in order. Now, we ask “In how many ways can we fill the first position? (task 1)” The answer is 10. Once the first position is filled, we ask “In how many ways can we fill the second position? (task 2)” The answer is 9, since only 9 people are left to arrange. Continuing this process, we find that there is only one way to fill the 10th position. The total number of ways to arrange 10 people is (10)(9)(8)(7)(6)(5)(4)(3)(2)(1) or 3628800. To see why we multiply these numbers, watch this video: https://www.youtube.com/watch?v=be8YUJcEfq4

Many times, we do not need to calculate the exact value of the product of the first \(N\) positive integers. Instead, we denote it by \(N!\) and read it like “N factorial” or “factorial of N”. For example, \(3!=6\) and \(6!=720\). So, to arrange 6 distinct items in order, there are 720 ways. To arrange \(N\) distinct items in order, there are \(N!\) ways.
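Both ways of counting, listing every arrangement and multiplying with the counting principle, can be tried in Python. This is a small sketch using only the standard library; the letters A, B, and C stand for Alice, Bob, and Colin.

```python
from itertools import permutations
from math import factorial

# Enumerate the line-ups of Alice, Bob, and Colin.
lineups = ["".join(p) for p in permutations("ABC")]
print(lineups)        # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
print(len(lineups))   # 6

# Counting without enumerating: N! ways to arrange N distinct items.
print(factorial(3))   # 6
print(factorial(6))   # 720
print(factorial(10))  # 3628800
```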

Quick quiz:

Find the value of \(\frac{2022!}{2021!}\).

Can we make 0! meaningful? Yes, it is the number of ways to arrange 0 items in order, and 0! should be 1, since doing nothing counts as one way.

You might be wondering whether (-1)! makes sense. This is equivalent to asking whether \(N!\) can be defined when \(N\) is a negative integer. Since this is irrelevant to this course, I encourage you to study it on your own.

We are in a position to study the Shapley-Shubik power.

To learn the basic concepts, read the textbook from page 50 to page 52, up to Example 2.16. You can also watch this video (right click and choose “Open link in new tab”) to quickly get the ideas.

Additional examples are provided by this video (right click and choose “Open link in new tab”) and by this video (right click and choose “Open link in new tab”).
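These notes leave the definitions to the textbook and videos, so the Python sketch below rests on an assumption: it uses the standard definition, in which the players join a coalition one at a time in every possible order, the player whose vote first brings the running total up to the quota is the pivotal player of that ordering, and a player’s Shapley-Shubik power index is the fraction of the \(N!\) orderings in which that player is pivotal. The function name and the small system [4: 3, 2, 1] are illustrative choices, not taken from these notes.

```python
from fractions import Fraction
from itertools import permutations

def shapley_shubik(quota, weights):
    """Fraction of the N! orderings in which each player is pivotal."""
    n = len(weights)
    pivotal = [0] * n
    orderings = 0
    for order in permutations(range(n)):
        orderings += 1
        running = 0
        for i in order:
            running += weights[i]
            if running >= quota:  # player i brings the tally up to the quota
                pivotal[i] += 1
                break
    return [Fraction(c, orderings) for c in pivotal]

small = shapley_shubik(4, [3, 2, 1])
print(small)  # [Fraction(2, 3), Fraction(1, 6), Fraction(1, 6)]

# The system from the Banzhaf example, for comparison with its Banzhaf distribution.
ssi = shapley_shubik(16, [7, 6, 3, 3, 2])
print(ssi)
```

In [4: 3, 2, 1] there are \(3!=6\) orderings; the first player is pivotal in 4 of them and each of the other two players in 1.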

Suggested Problems from the Textbook

1, 3, 5, 7, 12, 15, 29, 39, 41, 51, 57

Chapter 3. The Mathematics of Sharing

Let’s first look at an example.

Alice and Bob want to split a stash of 600 pieces of candy, of which 200 are Snickers, 200 are Milky Ways, and the remaining 200 are Reese’s. Bob values each Snickers at $0.01, each Milky Way at $0.03, and each Reese’s at $0.02.

Alice is the divider of the candy. Based on her value system, she splits the candy in two halves of the same value. Bob chooses the half that contains 80 Snickers, 90 Milky Ways, and 120 Reese’s.

  1. What is the total value of all the 600 pieces of candy from Bob’s perspective?

  2. What is the value of this half to Bob?

  3. Is this a fair share for him? By “fair” we mean Bob thinks his share has no less than 50% of the total value of the candy.

  4. If Bob is the divider, to create fair shares, how many pieces of Reese’s should he put in one half if this half already contains 120 pieces of Snickers and 100 pieces of Milky Ways?

Here is the solution:

  1. To Bob, the total value of all candy equals

\[(\$0.01)(200)+(\$0.03)(200)+(\$0.02)(200)=\$12.00.\]

  2. The value of Bob’s share equals \[(\$0.01)(80)+(\$0.03)(90)+(\$0.02)(120)=\$5.90.\]

  3. Since the value of Bob’s share ($5.90) is less than 50% of the total value ($6.00) based on Bob’s value system, this is not a fair share.

  4. Since the total value of all the candy is $12.00, this half should be worth $6.00 to Bob. Suppose that this half contains \(x\) Reese’s. Then \[(\$0.01)(120)+(\$0.03)(100)+(\$0.02)(x) = \$6.00,\] which is equivalent to \[1.20 + 3.00 + 0.02x = 6.00.\] Solving this equation yields \(x = 90.\)
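The four answers can be double-checked with a few lines of Python. In this sketch the values are kept in whole cents to avoid rounding issues, and the names (`value`, `worth`, and so on) are illustrative.

```python
# Bob's values in cents per piece (from the problem statement).
value = {"Snickers": 1, "Milky Way": 3, "Reese's": 2}
stash = {"Snickers": 200, "Milky Way": 200, "Reese's": 200}
chosen_half = {"Snickers": 80, "Milky Way": 90, "Reese's": 120}

def worth(pieces):
    """Value of a pile of candy to Bob, in cents."""
    return sum(value[kind] * count for kind, count in pieces.items())

total = worth(stash)          # value of all 600 pieces
share = worth(chosen_half)    # value of the half Bob chose
is_fair = 2 * share >= total  # fair means at least half of the total value

# Part 4: Reese's needed so that 120 Snickers + 100 Milky Ways + x Reese's is worth half.
partial = worth({"Snickers": 120, "Milky Way": 100})
x = (total // 2 - partial) // value["Reese's"]

print(total / 100, share / 100, is_fair, x)  # 12.0 5.9 False 90
```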

3.1 Fair-Division Games

A fair division is a problem of dividing assets among \(N\) stakeholders, such that in each stakeholder’s opinion, he or she has received a share that is at least \(\frac{1}{N}\) of the value of the assets. \(\frac{1}{N}\) is the threshold for a fair share. Fair-division problems can be thought of in terms of games with players, goals, rules, and strategies.

Each share is called a fair share from the share holder’s perspective. A share that is fair to one player does not have to be fair to another player.

The basic elements of every fair-division game are as follows:

  • The assets. These are the things (tangible or intangible) being divided. Some assets may have negative values. Such assets can be chores, obligations, liabilities, etc. Denote the set of all assets by \(S\).

  • The players. They are the parties that divide the assets.

  • The value systems. Each player has the ability to assign a value to any part of the assets. Different players may have different value systems.

  • A fair-division method. These are the rules that govern the way the game is played. The method must be very specific and leave no room for ambiguity.

The following assumptions about players are made:

  • Rationality. Each player is a thinking, rational entity seeking to maximize their share of the assets.

  • Cooperation. Each player is a willing participant and accepts the rules of the game as binding.

  • Privacy. Each player has no useful information on other players’ value systems and thus has no idea what kinds of moves they are going to make in the game.

  • Symmetry. Players have equal rights in sharing the assets, and thus each player is entitled to a proportional share of the assets.

Example 1.

Mark, Tim, Maia, and Kelly are dividing among themselves a set of common assets equally owned by the four of them. The assets are divided into four shares \(s1, s2, s3\), and \(s4\). The table below shows the values of the shares to each player expressed as a percent of the total value of the assets.

  1. Which of the shares are fair shares to Mark?
  2. Which of the shares are fair shares to Tim?
  3. Which of the shares are fair shares to Maia?
  4. Find all possible fair divisions of the assets using \(s1\), \(s2\), \(s3\), and \(s4\).

Solution.

With four players, the threshold for a fair share is \(\frac{1}{4}\) or 25%.

  1. To Mark, \(s2\) and \(s3\) are both fair shares, since in Mark’s opinion, the values of the four shares are, respectively, 20%, 32%, 28%, and 20%, with the second and third shares larger than the threshold.

  2. To Tim, \(s1\), \(s2\), \(s3\), and \(s4\) are all fair shares, since in Tim’s opinion, the values of the four shares are, respectively, 25%, 25%, 25%, and 25%, with all shares at least equal to the threshold.

  3. To Maia, \(s3\) and \(s4\) are both fair shares.

  4. There are two possible fair divisions of the assets:

    1. Mark gets \(s2\); Tim gets \(s1\); Maia gets \(s3\); Kelly gets \(s4\).
    2. Mark gets \(s3\); Tim gets \(s1\); Maia gets \(s4\); Kelly gets \(s2\).

Example 2.

Alice, Bob, and Carlos are dividing among themselves the family farm equally owned by the three of them. The property is divided into three shares (s1, s2, and s3). The table below shows the values of the shares to each player expressed as a percent of the total value of the property.

Share
s1 s2 s3
Alice 38% 28% 34%
Bob 100/3% 100/3% 100/3%
Carlos 34% 40% 26%
  1. What is the threshold for a fair share?

  2. Which of the shares are fair shares to Alice?

  3. Which of the shares are fair shares to Bob?

  4. Which of the shares are fair shares to Carlos?

  5. Find all possible fair divisions of the assets using s1, s2, and s3 as shares.

  6. Of the fair divisions found in part 5, is there an optimal division? Why or why not?

Solution.

  1. The threshold for a fair share is \(\frac{1}{3}\), since there are 3 players.

  2. The fair shares for Alice are s1 and s3, since, based on Alice’s value system, the values of s1 and s3 are both greater than the threshold.

  3. The fair shares for Bob are s1, s2, and s3, since, based on Bob’s value system, none of the values of s1, s2, and s3 is below the threshold.

  4. The fair shares for Carlos are s1 and s2, since, based on Carlos’ value system, the values of s1 and s2 are both greater than the threshold.

  5. There are 3 possible fair divisions of the assets: (1) Alice gets s1, Bob gets s3, and Carlos gets s2; (2) Alice gets s3, Bob gets s1, and Carlos gets s2; (3) Alice gets s3, Bob gets s2, and Carlos gets s1.

  6. Of the fair divisions found in part 5, “Alice gets s1, Bob gets s3, and Carlos gets s2” is the optimal division, since each player gets the share that is worth the most in their own opinion.
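The reasoning in Example 2 can be checked by brute force. The following is a minimal sketch, not a standard routine: the function names `fair_shares`, `fair_divisions`, and `optimal_divisions` are made up for illustration, shares are numbered from 0 (so index 0 means s1), and "optimal" is taken to mean that every player receives a share they value most highly, as in the solution above.

```python
from fractions import Fraction
from itertools import permutations

def fair_shares(values):
    """Map each player to the shares worth at least 1/N of the total to that player."""
    threshold = Fraction(100, len(values))  # values are percents of the total
    return {player: [i for i, v in enumerate(row) if v >= threshold]
            for player, row in values.items()}

def fair_divisions(values):
    """Every assignment (one share per player) in which all players get a fair share."""
    players = list(values)
    fair = fair_shares(values)
    return [dict(zip(players, perm))
            for perm in permutations(range(len(players)))
            if all(s in fair[p] for p, s in zip(players, perm))]

def optimal_divisions(values):
    """Fair divisions in which every player receives a share they value most highly."""
    return [d for d in fair_divisions(values)
            if all(values[p][s] == max(values[p]) for p, s in d.items())]

# Table of Example 2 (Bob values every share at exactly 100/3 percent).
example2 = {
    "Alice":  [Fraction(38), Fraction(28), Fraction(34)],
    "Bob":    [Fraction(100, 3)] * 3,
    "Carlos": [Fraction(34), Fraction(40), Fraction(26)],
}

for division in fair_divisions(example2):
    print({p: f"s{s + 1}" for p, s in division.items()})
print("optimal:", optimal_divisions(example2))
```

Exact fractions are used so that Bob's shares compare as equal to the threshold rather than slightly below it because of rounding.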

Example 3.

Angie, Bev, Ceci, and Dina are dividing among themselves a set of common assets equally owned by the four of them. The assets are divided into four shares (s1, s2, s3, and s4). The table below shows the values of the shares to each player expressed as a percent of the total value of the assets.

Share
s1 s2 s3 s4
Angie 22% 26% 28% 24%
Bev 25% 26% 22% 27%
Ceci 20% 30% 27% 23%
Dina 25% 25% 25% 25%
  1. Which of the shares are fair shares to Angie?

  2. Which of the shares are fair shares to Bev?

  3. Which of the shares are fair shares to Ceci?

  4. Which of the shares are fair shares to Dina?

  5. Find all possible fair divisions of the assets using s1, s2, s3, and s4 as shares.

  6. Of the fair divisions found in part 5, is there an optimal division? Why or why not?

Solution.

  1. s2, s3

  2. s1, s2, s4

  3. s2, s3

  4. s1, s2, s3, s4

  5. All possible fair divisions are:

    1. Angie gets s2; Bev gets s1; Ceci gets s3; Dina gets s4.
    2. Angie gets s2; Bev gets s4; Ceci gets s3; Dina gets s1.
    3. Angie gets s3; Bev gets s1; Ceci gets s2; Dina gets s4.
    4. Angie gets s3; Bev gets s4; Ceci gets s2; Dina gets s1.
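  6. Yes. Fair division 4 is optimal: Angie gets s3 (28%), Bev gets s4 (27%), Ceci gets s2 (30%), and Dina gets s1 (25%), so each player receives the share that is worth the most in their own opinion. In fair divisions 1 and 2, Angie and Ceci do not maximize their shares, and in fair division 3, Bev does not.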

The Divider-Chooser Method

We introduce the divider-chooser method for creating fair divisions for \(N\) players (\(N\ge 2\)). According to this method, one player, called the divider, divides the assets into \(N\) shares, and the other players pick the shares they want by some rules, leaving the last share to the divider. The assets must be divisible; such assets are called continuous assets. Examples are land, cakes, and pizzas.

Two Players

Let’s say Tom and Jerry jointly own a cheesecake (textbook page 73). How might they divide the cake? The basic idea behind the divider-chooser method is that when the cake is divided between the two, the same person should not be both divider and chooser. The divider cuts the cake into two shares of equal value from the divider’s perspective, so the chooser can always pick a share that is worth at least 50% of the whole cake from the chooser’s perspective. It is always better to be the chooser than the divider, so it is wise to decide who is the divider by a coin toss.

Example 4.

Martha and Nick jointly bought the giant 28-in. sub sandwich shown in Fig. below for $9. They plan to divide the sandwich fairly using the divider-chooser method. Martha likes ham subs twice as much as she likes turkey subs, and she likes turkey and roast beef subs the same. Nick likes roast beef subs twice as much as he likes ham subs, and he likes ham and turkey subs the same. Assume that Nick and Martha just met and know nothing of each other’s likes and dislikes. Assume also that when the sandwich is cut, the cut is made perpendicular to the length of the sandwich. (You can describe different shares of the sandwich using the ruler and interval notation. For example, [0,8] describes the ham part, [8,12] describes one-third of the turkey part, etc.).

Sandwich

Suppose that they flip a coin and Martha ends up being the divider.

  1. Describe how Martha would cut the sandwich into two shares s1 and s2.

  2. After Martha cuts, Nick gets to choose. Specify which of the two shares Nick should choose, and give the value of the share to Nick.

Solution.

Let each inch of turkey be \(\$x\) in Martha’s eyes. Since Martha likes ham subs twice as much as she likes turkey subs, and she likes turkey and roast beef subs the same, to Martha each inch of ham is \(\$2x\) and each inch of roast beef is \(\$x\). Since there are 8 inches of ham, the total value of ham is \(\$(2x)(8)\) or \(\$16x\). Since there are 12 inches of turkey, the total value of turkey is \(\$(x)(12)\) or \(\$12x\). Since there are 8 inches of roast beef, the total value of roast beef is \(\$(x)(8)\) or \(\$8x\). The total value of the sandwich to Martha is

\[\$ (16x + 12x + 8x)\] or \(\$36x\), which equals $9. Solving the equation \(36x=9\) gives \(x=\frac{1}{4}\).

  1. Now, we know that, to Martha, each inch of ham is $0.50 and each inch of turkey or roast beef is $0.25.

Next, the 8-inch ham part is worth $4.00 to Martha. To use the Divider-Chooser method, Martha must cut the sandwich into two shares that are each worth half of the total, or $4.50, to her. One share consists of all the ham and part of the turkey. Since the ham is worth $4.00, the turkey in that share must be worth $0.50, which means 2 inches of turkey. Therefore, Martha should cut the sandwich at the point marked as 10. That is, the two shares are s1: [0, 10]; s2: [10, 28].

  2. By the Divider-Chooser method, Nick gets to choose. To Nick, the value of each inch differs from one kind of sub to another. Let each inch of ham be \(\$x\) to Nick. Since Nick likes roast beef subs twice as much as he likes ham subs, and he likes ham and turkey subs the same, to him each inch of turkey is \(\$x\) and each inch of roast beef is \(\$2x\). Just like for Martha, we need to solve the following equation for \(x\):

\[8x+12x+(8)(2x)=9\] The equation simplifies to \(36x=9\), so \(x=\$0.25\).

To Nick, the first share is worth \(8(0.25)+2(0.25)=\$2.50\) and the second share is worth \(10(0.25)+8(0.50)=\$6.50\), so Nick would choose the second share (s2), which is worth $6.50 to him.
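The arithmetic in this example can be reproduced with a short computation. The sketch below assumes the sandwich layout implied by the example (ham on [0, 8], turkey on [8, 20], roast beef on [20, 28]); the helper names `price_per_inch`, `value`, and `half_cut` are hypothetical and chosen only for this illustration.

```python
def price_per_inch(lengths, weights, total):
    """Dollar value of one inch of each section, given relative preference weights."""
    unit = total / sum(l * w for l, w in zip(lengths, weights))
    return [w * unit for w in weights]

def value(prices, lengths, a, b):
    """Value of the piece [a, b] of the sandwich."""
    worth, start = 0.0, 0.0
    for price, length in zip(prices, lengths):
        end = start + length
        overlap = max(0.0, min(b, end) - max(a, start))
        worth += price * overlap
        start = end
    return worth

def half_cut(prices, lengths):
    """Point c such that [0, c] is worth exactly half of the whole sandwich."""
    target = value(prices, lengths, 0, sum(lengths)) / 2
    start = 0.0
    for price, length in zip(prices, lengths):
        section = price * length
        if section >= target:
            return start + target / price
        target -= section
        start += length
    return float(sum(lengths))

LENGTHS = [8, 12, 8]                               # ham, turkey, roast beef (inches)
martha = price_per_inch(LENGTHS, [2, 1, 1], 9)     # ham counts double for Martha
nick = price_per_inch(LENGTHS, [1, 1, 2], 9)       # roast beef counts double for Nick

cut = half_cut(martha, LENGTHS)                    # where Martha, the divider, cuts
s1_to_nick = value(nick, LENGTHS, 0, cut)
s2_to_nick = value(nick, LENGTHS, cut, 28)
print(cut, s1_to_nick, s2_to_nick)
```

The same functions can be used for the exercise in which Nick is the divider: call `half_cut(nick, LENGTHS)` and evaluate both shares with Martha's prices.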

The Lone-Divider Method

The Divider-Chooser Method can be extended to three or more players.

When there are three players involved in a continuous fair-division, one of them plays the role of the divider while the other two play the role of choosers. The procedure is this:

Step 0 (Preliminaries). One of the three will be the divider; the other two will be choosers. Since it is better to be a chooser, the decision of who is what is made by a random draw (such as rolling dice). Call the divider D and the choosers C1 and C2.

Step 1 (Division). The divider divides the cake into three shares (s1, s2, and s3). D will get one of these shares, but at this point does not know which one. Not knowing which share he or she will get (privacy assumption) forces D to divide the cake into three shares of equal value in his or her opinion (rationality assumption).

Step 2 (Bidding). C1 declares (usually by writing on a slip of paper) which of the three pieces are fair shares to her. Independently, C2 does the same. These are the bids. A chooser’s bid must list every single piece that she or he considers to be a fair share (i.e., worth one-third or more of the cake).

Step 3 (Distribution). Who gets which piece? The answer depends on which pieces are listed in the bids. Separate the pieces into C-pieces (these are pieces chosen by at least one chooser) and U-pieces (these are unwanted pieces that did not appear in either of the bids). A C-piece is a piece that at least one of the two choosers values at one-third or more, while a U-piece is a piece that both choosers value at less than one-third. Depending on the number of C-pieces, there are two cases to consider.

Case 1. When there are two or more C-pieces, consider the different combinations to find all possible fair divisions based on the bids and choose one (the optimal one, if it exists).

Case 2. When there is only one C-piece, give one of the two U-pieces to the divider; the divider does not care which one, since both U-pieces are equally fair to the divider. Then combine the other U-piece and the C-piece into one piece, called the B-piece, which is worth more than two-thirds of the cake from the perspective of either chooser (why?). The problem now becomes a two-player problem: the two choosers divide the B-piece between themselves using the divider-chooser method.

Example 5.

Alice, Bob, and Carlos are dividing a cake using the Lone-Divider Method. They draw cards from a well shuffled deck of cards, and Bob draws the low card (bad luck!) and has to be the divider.

Step 1 (Division). Bob divides the cake into three shares (s1, s2, and s3). The table below shows the values of the shares in the eyes of each of the three players. The values are expressed as a percent of the total value of the cake.

Share
s1 s2 s3
Alice 38% 28% 34%
Bob 100/3% 100/3% 100/3%
Carlos 34% 40% 26%

Step 2 (Bidding). From the table shown, we can assume that Alice’s bid list is {s1, s3}, since to her both shares are worth at least one-third of the total value; Carlos’ bid list is {s1, s2}, since to him both shares are worth at least one-third of the total value.

Step 3 (Distribution). The C-pieces are s1, s2, and s3, since each of them is listed by at least one chooser. The optimal distribution is Alice gets s1, Carlos gets s2, and Bob gets s3.
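Steps 2 and 3 can be mimicked with a few lines of code. This is only a sketch under stated assumptions: the function names `bids`, `classify_pieces`, and `distributions` are hypothetical, pieces are numbered from 0 (index 0 means s1), and only Case 1 (two or more C-pieces) is handled by `distributions`.

```python
from fractions import Fraction
from itertools import permutations

def bids(values, divider):
    """Each chooser bids on every piece worth at least 1/N of the total to them."""
    threshold = Fraction(100, len(values))      # values are percents
    return {p: {i for i, v in enumerate(row) if v >= threshold}
            for p, row in values.items() if p != divider}

def classify_pieces(chooser_bids, n_pieces):
    """Split the pieces into C-pieces (in some bid) and U-pieces (in no bid)."""
    c_pieces = set().union(*chooser_bids.values())
    return c_pieces, set(range(n_pieces)) - c_pieces

def distributions(chooser_bids, divider, n_pieces):
    """Case 1: every chooser gets a piece from their own bid, the divider gets the leftover."""
    choosers = list(chooser_bids)
    found = []
    for perm in permutations(range(n_pieces), len(choosers)):
        if all(piece in chooser_bids[c] for c, piece in zip(choosers, perm)):
            leftover = (set(range(n_pieces)) - set(perm)).pop()
            found.append({**dict(zip(choosers, perm)), divider: leftover})
    return found

# Table of Example 5; Bob is the divider.
example5 = {
    "Alice":  [Fraction(38), Fraction(28), Fraction(34)],
    "Bob":    [Fraction(100, 3)] * 3,
    "Carlos": [Fraction(34), Fraction(40), Fraction(26)],
}
chooser_bids = bids(example5, divider="Bob")
c_pieces, u_pieces = classify_pieces(chooser_bids, 3)
print(chooser_bids, c_pieces, u_pieces)
print(distributions(chooser_bids, "Bob", 3))
```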

The Method of Sealed Bids

A fair division game for indivisible items is called a discrete fair division. Examples of indivisible items include houses, cars, boats, jewelry, and paintings. The method of sealed bids is one method for discrete fair division. The method has 5 steps:

Step 1: Bidding. Each player submits, in a sealed envelope, a bid (in dollars) for each of the items in the estate, giving their honest assessment of the actual value of each item. No player should be privy to another player’s bids before making their own. Once the bids are unsealed, the fair-share value for each player can be calculated by adding the player’s bids for all of the items in the estate to get the total value and then dividing this total by the number of players. Make a table with players as columns and items as rows, followed by two additional rows “Total value” and “Fair-share value” in this order.

Step 2: Allocation. Each item will go to the highest bidder for that item (flip a coin when there is a tie). On the table just constructed, circle the value of the items going to the corresponding players.

Step 3: Initial Settlement. Subtract each player’s fair-share value from the total value of the items that player gets (it is possible that a player gets nothing). If this difference is positive, the player must pay this amount to the estate while getting the items the player is entitled to. If the difference is negative, the player receives this amount of cash from the estate while getting the items the player is entitled to. Make a new table with players as columns and with rows named “Item(s) received”, “Value received”, “Fair-share value”, and “To (from) estate” in this order. The numbers in the row “To (from) estate” equal the numbers in the row “Value received” minus the corresponding numbers in the row “Fair-share value”. The textbook uses a number in parentheses to denote a negative value. Add all the numbers in the row “To (from) estate” to get the so-called surplus (i.e., leftover) that belongs to the estate.

Step 4: Division of the Surplus. Divide the surplus equally among the players. Add an additional row called “Share of surplus” to the second table, so that each number in the new row equals the surplus divided by the number of players.

Step 5: Final Settlement. Add an additional row called “Final settlement” to the second table, so that the new row equals the row “To (from) estate” minus the share of surplus for each player.

Example 6.

Al and Betty are getting a divorce. The only property of joint value is their house. Rather than hiring attorneys and going to court to figure out how to split up the house, they agree to give the sealed bids method a try.

Al’s bid on the house is $340,000 and Betty’s bid is $364,000. The following is the first table, constructed according to the steps outlined above.

Al Betty
House $340,000 $364,000
Total $340,000 $364,000
Fair-share value $170,000 $182,000

The second table is

Al Betty
Item(s) received None House
Value received $0 $364,000
Fair-share value $170,000 $182,000
To (from) estate ($170,000) $182,000
Share of surplus $6,000 $6,000
Final settlement ($176,000) $176,000

where a number with parentheses in the row “Final settlement” indicates that the designated amount of money should be received by the corresponding player from the estate, while a number without parentheses in the row “Final settlement” indicates that the designated amount of money should be paid to the estate.

The final settlement: Betty gets the house but has to pay the estate $176,000; Al gets cash of $176,000.
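The five steps can be written as one short function. The sketch below is an illustration under stated assumptions, not the code behind the app mentioned in these notes: the name `sealed_bids` and the layout of its input are made up here, and a tie is broken by alphabetical order of the players instead of a coin flip. A negative amount means the player receives cash from the estate.

```python
def sealed_bids(bids):
    """bids maps each item to a dict {player: bid in dollars}."""
    players = sorted({p for item_bids in bids.values() for p in item_bids})
    n = len(players)

    # Step 1: fair-share value = (sum of the player's bids) / (number of players).
    fair_share = {p: sum(b[p] for b in bids.values()) / n for p in players}

    # Step 2: each item goes to its highest bidder (ties: first player alphabetically).
    items = {p: [] for p in players}
    value_received = {p: 0 for p in players}
    for item, item_bids in bids.items():
        winner = max(players, key=lambda p: item_bids[p])
        items[winner].append(item)
        value_received[winner] += item_bids[winner]

    # Step 3: amount owed to the estate (negative = received from the estate).
    to_estate = {p: value_received[p] - fair_share[p] for p in players}

    # Step 4: the surplus is shared equally.
    share = sum(to_estate.values()) / n

    # Step 5: final settlement.
    return {p: {"items": items[p],
                "to_estate": to_estate[p],
                "share_of_surplus": share,
                "final": to_estate[p] - share}
            for p in players}

result = sealed_bids({"House": {"Al": 340_000, "Betty": 364_000}})
for player, row in result.items():
    print(player, row)
```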

For examples involving multiple indivisible items and more than 2 players, watch the video.

Apps

The instructor made an app which is available here: https://scsu.shinyapps.io/SealedBids/. The app can also be used for reverse auctions (such as selling a service or job to bidders with lowest bidder getting the service) and negative-valued items such as chores and other unpleasant responsibilities. To use the app, just type prices as negative values. Use the app to do Example 3.12 on page 83 of the textbook and Exercise 49.

Suggested Exercises from Textbook

Questions 1, 3, 5, 17, and

Refer to Example 4 of this lecture note (the sub sandwich shared by Martha and Nick). Suppose that they flip a coin and Nick ends up being the divider.

  1. Describe how Nick would cut the sandwich into two shares s1 and s2.

  2. After Nick cuts, Martha gets to choose. Specify which of the two shares Martha should choose and give the value of the share to Martha.

Chapter 4. The Mathematics of Apportionment

The chapter problem #1:

A mom has 50 pieces of candy, which she is planning to divide among her five children. To do this fairly, she would give each child 50/5 or 10 pieces of candy. This is a simple division problem. But, the mom wants to teach her children about the value of work and about the relationship between work and reward. This leads her to the following idea:

She announces to the children that the candy is going to be divided at the end of the week in proportion to the amount of time each of them spends helping with the weekly kitchen chores. Now, the problem is turned into the so-called apportionment problem.

At the end of the week, the data are in:

Child Alan Betty Connie Doug Ellie Total
Minutes worked 150 78 173 204 295 900

Here is a quick solution:

Alan worked 150 minutes out of the total 900 minutes, so his fraction of the time is \(\frac{150}{900}\) or \(\frac{1}{6}\). If you have difficulty simplifying fractions, watch this video: https://www.youtube.com/watch?v=4CKDqvddhhg. Since there are 50 pieces of candy, he should get \(50\cdot \frac{1}{6}\) or \(\frac{50}{6}\) or \(\frac{25}{3}\) or \(8\frac{1}{3}\) or 8.33 pieces. If you have difficulty with multiplication involving fractions and integers, watch this video: https://www.youtube.com/watch?v=2gFCEY9Hxas. Similarly, Betty should get \(50\cdot \frac{78}{900}\) or \(\frac{13}{3}\) or \(4\frac{1}{3}\) or 4.33, Connie should get \(9\frac{11}{18}\) or 9.61, Doug should get \(11\frac{1}{3}\) or 11.33, and Ellie should get \(16\frac{7}{18}\) or 16.39. It’s customary to round these numbers to two or three decimal places.

Since the number of pieces of candy needs to be an integer, you might round the above values to 8, 4, 10, 11, and 16, respectively. Adding these integers gives 49, which is 1 less than 50, so it’s not a perfect apportionment. Which child, if any, should get the leftover piece?
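The quick solution can be checked with exact fractions. This is a small sketch using Python's `fractions` module; the variable names are chosen only for this illustration.

```python
from fractions import Fraction

minutes = {"Alan": 150, "Betty": 78, "Connie": 173, "Doug": 204, "Ellie": 295}
candy = 50
total_minutes = sum(minutes.values())

# Exact share of each child: candy * (minutes worked / total minutes).
exact = {child: Fraction(candy * m, total_minutes) for child, m in minutes.items()}

# Conventional rounding to the nearest integer.
rounded = {child: round(share) for child, share in exact.items()}

for child in minutes:
    print(child, exact[child], f"{float(exact[child]):.2f}", rounded[child])
print("pieces handed out after rounding:", sum(rounded.values()))
```

The exact shares always add up to 50, while the rounded shares need not, which is precisely the difficulty an apportionment method has to resolve.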

The chapter problem #2:

The republic of Parador consists of six states called A, B, C, D, E, and F, with populations given in the following table:

State A B C D E F Total
Population 1,646,000 6,936,000 154,000 2,091,000 685,000 988,000 12,500,000

There are 250 seats in the congress, which according to the laws of Parador, are to be apportioned among the six states in proportion to their respective populations.

What is the “correct” apportionment?

Here is a quick solution:

The number of seats each of the six states should get is given as follows:

A: \(250\cdot \frac{1,646,000}{12,500,000}\) or 32.92

B: \(250\cdot \frac{6,936,000}{12,500,000}\) or 138.72

C: \(250\cdot \frac{154,000}{12,500,000}\) or 3.08

D: \(250\cdot \frac{2,091,000}{12,500,000}\) or 41.82

E: \(250\cdot \frac{685,000}{12,500,000}\) or 13.70

F: \(250\cdot \frac{988,000}{12,500,000}\) or 19.76

Since the number of seats must be an integer, the above values are rounded to the nearest integers 33, 139, 3, 42, 14, and 20, which add up to 251, one seat more than available. Again, not a perfect apportionment.

In real life, apportionment is important in many other situations, such as assigning nurses to shifts in a hospital and assigning learning assistants to sessions of the same course.

4.1 Apportionment Problems and Apportionment Methods

We will introduce a few methods for apportionment from now on. Since the gold standard for apportionment applications is the allocation of seats to “states” in a legislature, the basic elements of every apportionment problem are as follows:

  • The “states.” These are the parties having a stake in the apportionment. The states can be districts, counties, people, etc. We use \(N\) to denote the number of states.

  • The “seats.” These are the \(M\) identical, indivisible objects that are being divided among the \(N\) states.

  • The “populations.” These are the \(N\) positive numbers that are used as the basis for the apportionment of the seats to the \(N\) states. If we use \(p_1, p_2, \cdots, p_N\) to denote the populations of the respective states, then the total population \(P\) is equal to \(p_1 + p_2 + \cdots + p_N\).

  • An apportionment method. This is the systematic procedure that guarantees a division of the \(M\) seats (no more and no less) to the \(N\) states using some formula based on the populations of the states.

Recall that in chapter problem #2 we used \[250\cdot \frac{1,646,000}{12,500,000}\] to calculate the seats allocated to state A. Mathematically, this expression can also be written as (check it!)

\[\frac{1,646,000}{(\frac{12,500,000}{250})}\] The part in the parentheses is the number of people represented by each seat in the congress. This number (without rounding) is called the standard divisor, denoted by (\(SD\)), which is the ratio of total population to seats, or the average number of people per seat. It gives us a unit of measurement: 1 seat = \(SD\) people. It will be used for apportionment calculation.

The number of seats that a state of population p would get can be calculated as \(\frac{p}{SD}\), and we denote it by \(q\); that is, \(q=\frac{p}{SD}\). This number represents the exact fractional number of seats and is called the standard quota of the state.

Since the standard quota of a state is usually not an integer, it may be rounded down or up to get the lower quota or upper quota. Use letters \(L\) or \(U\) to denote them.

For chapter problem #2 introduced before, the standard divisor is \(SD=\frac{12,500,000}{250}\) or 50,000. The standard quota for state A is 32.92, and the nearest integer to it is 33. The lower and upper quotas are 32 and 33, respectively. Similar calculations can be done for the other states and are summarized in the following table:

State A B C D E F Total
Standard quota 32.92 138.72 3.08 41.82 13.70 19.76 250
Nearest integer 33 139 3 42 14 20 251
Lower quota 32 138 3 41 13 19 246
Upper quota 33 139 4 42 14 20 252

The allocation of seats according to the conventional rounding of the standard quotas is not really an apportionment method, because the rounded numbers need not add up to the total number of seats. In the next few sections, we will introduce a few methods that are real apportionment methods (the individual seats always add up to the total number of seats).
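The standard divisor and the quotas for Parador can be computed as follows. This is a minimal sketch; the variable names are chosen for illustration, and the population of state A is taken to be 1,646,000, the value that is consistent with the total of 12,500,000.

```python
import math
from fractions import Fraction

populations = {"A": 1_646_000, "B": 6_936_000, "C": 154_000,
               "D": 2_091_000, "E": 685_000, "F": 988_000}
seats = 250

P = sum(populations.values())
SD = Fraction(P, seats)                                # standard divisor: people per seat

standard_quota = {s: p / SD for s, p in populations.items()}   # q = p / SD
lower_quota = {s: math.floor(q) for s, q in standard_quota.items()}
upper_quota = {s: math.ceil(q) for s, q in standard_quota.items()}

print("SD =", SD)
for s in populations:
    print(s, float(standard_quota[s]), lower_quota[s], upper_quota[s])
print("sum of lower quotas:", sum(lower_quota.values()))
print("sum of upper quotas:", sum(upper_quota.values()))
```

The standard quotas add up to exactly 250, but neither the lower quotas nor the upper quotas do, so a further rule is needed to decide which states are rounded up.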

Hamilton’s Method

Alexander Hamilton’s method has three steps:

Step 1. Calculate each state’s standard quota.

Step 2. Round the standard quotas down and give to each state its lower quota.

Step 3. Give the surplus seats (one at a time) to the states with the largest residues (fractional parts) until there are no more surplus seats.

Let’s continue with the apportionment problem for Parador’s congress.

State A B C D E F Total
Standard quota 32.92 138.72 3.08 41.82 13.70 19.76 250
Lower quota 32 138 3 41 13 19 246
Residue 0.92 0.72 0.08 0.82 0.70 0.76 4.00
Order of surplus First Fourth - Second - Third
Apportionment 33 139 3 42 13 20 250
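The three steps of Hamilton's method translate directly into code. In the sketch below the function name `hamilton` is hypothetical, and ties between equal residues are broken by the order in which the states are listed, which is an assumption made only for this illustration.

```python
import math
from fractions import Fraction

def hamilton(populations, seats):
    """Apportion seats by Hamilton's method; returns {state: seats}."""
    total = sum(populations.values())

    # Step 1: standard quota of each state (kept as an exact fraction).
    quotas = {s: Fraction(p * seats, total) for s, p in populations.items()}

    # Step 2: give each state its lower quota.
    apportionment = {s: math.floor(q) for s, q in quotas.items()}

    # Step 3: hand out the surplus seats, one each, to the largest residues.
    surplus = seats - sum(apportionment.values())
    by_residue = sorted(quotas, key=lambda s: quotas[s] - apportionment[s], reverse=True)
    for s in by_residue[:surplus]:
        apportionment[s] += 1
    return apportionment

parador = {"A": 1_646_000, "B": 6_936_000, "C": 154_000,
           "D": 2_091_000, "E": 685_000, "F": 988_000}
result = hamilton(parador, 250)
print(result, sum(result.values()))
```

Calling `hamilton` with the same populations and different numbers of seats is one way to experiment with how the apportionment changes when the size of the congress changes.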

A nice video: https://www.youtube.com/watch?v=xyz

A nice summary is here: https://www.census.gov/history/www/reference/apportionment/methods_of_apportionment.html#:~:text=The%20Hamilton%2FVinton%20Method%20sets,result%20in%20surplus%20house%20seats.

A major flaw of Hamilton’s method is that the reallocation of surplus seats is biased in favor of large states since larger states tend to have larger residues (residues behave like a measurement error).

The Alabama Paradox: The textbook shows an example on pages 108-109 where Hamilton’s method runs into trouble. When the number of seats to be apportioned is increased by one and nothing else changes, a state can end up with fewer seats than it had before. Read it!

Jefferson’s Method (the Method of Greatest Divisor)

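Thomas Jefferson proposed his method in 1792 to apportion the U.S. House of Representatives. His method follows the first two steps of Hamilton’s method, except that the standard divisor is replaced by a slightly smaller divisor \(d\): each state’s modified quota \(\frac{p}{d}\) is computed and rounded down. Lowering the divisor makes the quotas bigger for all states, so the divisor is lowered gradually until the rounded-down quotas add up to exactly \(M\) seats and the surplus seats are all gone. So, this is a trial-and-error method.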
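The trial-and-error search for a divisor can be automated. The sketch below is one possible way to do it, not the procedure from the textbook: it replaces guessing by bisection between 0 and the standard divisor, the function name `jefferson` and the parameter `max_steps` are made up for this illustration, and it gives up if no divisor is found (which can happen when two states are exactly tied).

```python
import math

def jefferson(populations, seats, max_steps=100):
    """Search for a divisor d such that the rounded-down quotas p/d add up to `seats`."""
    total = sum(populations.values())
    low, high = 0.0, total / seats        # start from the standard divisor
    d = high
    for _ in range(max_steps):
        result = {s: math.floor(p / d) for s, p in populations.items()}
        assigned = sum(result.values())
        if assigned == seats:
            return d, result
        if assigned < seats:
            high = d                      # too few seats handed out: try a smaller divisor
        else:
            low = d                       # too many seats handed out: try a larger divisor
        d = (low + high) / 2
    raise ValueError("no suitable divisor found (possibly an exact tie)")

parador = {"A": 1_646_000, "B": 6_936_000, "C": 154_000,
           "D": 2_091_000, "E": 685_000, "F": 988_000}
divisor, result = jefferson(parador, 250)
print(divisor, result, sum(result.values()))
```

Many divisors work equally well; any of them produces the same apportionment, so the particular divisor found by the search is not important.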

Adams’s and Webster’s Methods

The Huntington-Hill Method

The Quota Rule and Apportionment Paradoxes

Chapter 5. The Mathematics of Getting Around

On a normal day a single UPS (United Parcel Service) driver delivers somewhere between 200 (in rural areas) and 500 (in urban areas) packages. Efficient routing is a key to success in delivering all the packages. How can a route be designed to minimize the total amount of “wasted” travel?

Trying to minimize the total length of a route is known as a street-routing problem. Such problems also apply to other situations, such as routing garbage trucks and tourist buses.

Street-Routing Problems

A routing problem is concerned with finding ways to route the delivery of goods or services (such as pizzas or bus service) to an assortment of destinations (such as homes or warehouses). When dealing with a routing problem, we are interested in two questions:

  • (Existence) Is an actual route possible? If yes,

  • (Optimization) Of all the possible routes, which one is the optimal route? Here, optimization can be measured against cost, distance, or time.

Example 1. (The Konigsberg Bridges Problem)

The figure below shows the layout of the town of Konigsberg in the 1700s, with a river running through town, the two islands on the river, and the north and south banks all connected by the seven bridges shown in yellow.

Konigsberg

A simplified figure and a graph model are shown below:

Konigsberg

Can a person take a walk around town fully crossing every bridge once and only once? The 18th-century Swiss mathematician Leonhard Euler proved that there is no route that satisfies the requirements.

Example 2. (The Bridges of Madison County)

The Madison River runs through Madison County, and there are four islands (A, B, C, and D) and 11 bridges joining the islands to both banks of the river (R and L) and one another.

Madison

Can you draw a graph like (b) above? Is there any route that fully crosses every bridge once and only once?

An Introduction to Graphs

A graph (not the same as the graph of a function) is a picture that has two basic elements: a set of “dots” called the vertices and a collection of “lines” called the edges.

When two vertices are connected by an edge, we say that they are adjacent vertices. When a vertex is connected with itself, the edge is called a loop.

Example 3.

Look at the following graph:

GraphWithLoops

The vertex set of the graph is \(\{a, b, c, d, e, f\}\). The edges are ab, ae, ae, bc, bd, cc, ce, de, de, and de.

The edge cc is a loop. Since vertex f is not connected with any other vertex, it is called an isolated vertex.

The graph is said to be disconnected (at least two pieces), while the graph in the Konigsberg bridges example is said to be a connected graph (a single piece).

Graphs without loops and without multiple edges (more than one edge connecting two vertices) are called simple graphs.

We may be interested in a path from one vertex to another vertex. By path we mean a sequence of distinct edges each adjacent to the next. The number of edges in a path is called the length of the path. In the graph just shown, the edges ab, bd, and de form a path of length 3, and this path can be described by listing the vertices in order like a, b, d, e. With this notation, a, e, d, b, c, c is a path of length 5 (This “trip” is possible because of the loop at c).

When a trip along the edges of the graph closes back on itself we call it a circuit rather than a path. A path is an open-ended trip while a circuit is a closed trip. In the above graph, a, b, d, e, a is a circuit of length 4, and a, e, a is a circuit of length 2, while c, c is a circuit of length 1.

Look at another graph below:

Cliques

This graph has 10 vertices and many edges. The vertices A, B, C, D, and E are all connected to each other, and thus they form what is known as a clique. The vertices F, G, and H form another clique.

The number of edges that meet at a vertex (say X) is called the degree of the vertex, and is denoted by \(deg(X)\). In the above graph, \(deg(A)=5\) and \(deg(G) = 2\). The parity of the degree of a vertex is very important in the theorems about graphs introduced later. Here, vertex A has an odd degree while vertex G has an even degree. When a graph has loops, each loop contributes 2 to the degree of the vertex to which the loop is tied. For example, the degree of the vertex c in Example 3 is 4, not 3.
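Degrees are easy to compute from a list of edges. The sketch below uses the edge list of the graph in Example 3; representing each edge as a two-letter string is a choice made only for this illustration.

```python
# Edge list of the graph in Example 3 (repeated entries are multiple edges, "cc" is a loop).
edges = ["ab", "ae", "ae", "bc", "bd", "cc", "ce", "de", "de", "de"]
vertices = "abcdef"

degree = {v: 0 for v in vertices}
for u, v in edges:
    degree[u] += 1
    degree[v] += 1          # for a loop u == v, so the loop adds 2 to the same vertex

odd_vertices = [v for v in vertices if degree[v] % 2 == 1]

print(degree)
print("odd vertices:", odd_vertices)
print("sum of degrees:", sum(degree.values()), "number of edges:", len(edges))
```

Note that the isolated vertex f has degree 0, and that the sum of all the degrees is twice the number of edges.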

An Euler path of a graph is a path that covers all the edges of the graph. An Euler circuit of a graph is a circuit that covers all the edges of the graph.

An edge in a connected graph is called a bridge (or cut edge) if the graph becomes disconnected when the edge is removed.

Example 4.

In the graph below,

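Bridge

The vertices A, B, C, D, E form a path, while the vertices D, C, B, E, and D form a circuit. Neither one covers every edge of the graph (for instance, neither uses the edge DF), so neither is an Euler path or an Euler circuit. There are two bridges: AB and DF.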

We have introduced the basic concepts in graph theory. We can use graphs to model real problems.

Euler’s Theorems and Fleury’s Algorithm

We will introduce the basic theory that allows us to determine if a graph has an Euler circuit, an Euler path, or neither. This is important, because Euler path or Euler circuit questions are the models for real-life street routing questions.

Euler’s Circuit Theorem

  • If a graph is connected and every vertex is even (has even degree), then it has at least one Euler circuit.

  • If a graph has any odd vertex, then it can NOT have an Euler circuit.

Euler’s Path Theorem

  • If a graph is connected and has exactly two odd vertices, then it has at least one Euler path. Any such path must start at one of the odd vertices and end at the other one.

  • If a graph has more than two odd vertices, then it can NOT have an Euler path.

Note that the two theorems only deal with connected graphs. For any connected graph, there are four cases to consider:

  • Case 1. The graph has no odd vertex.

  • Case 2. The graph has exactly one odd vertex.

  • Case 3. The graph has exactly two odd vertices.

  • Case 4. The graph has more than two odd vertices.

Under case 1, by the theorems, the graph has at least one Euler circuit and thus at least one Euler path.

Under case 2, neither theorem gives an answer; in fact, there is no graph with exactly one odd vertex, since every edge contributes 2 to the total degree of all vertices and thus the total degree of a graph must equal twice the number of edges, an even number.

Under case 3, by the theorems, the graph has at least one Euler path, but has no Euler circuit.

Under case 4, by the theorems, the graph has neither an Euler path nor an Euler circuit.

Case 2 is settled by the following theorem:

Euler’s Sum of Degrees Theorem

  • The sum of the degrees of all the vertices of a graph equals twice the number of edges (and, therefore, is an even number).

  • A graph always has an even number of odd vertices.

The theorem implies that Case 2 is impossible: no graph has exactly one odd vertex.

The above results are summarized in the following table:

Does a Connected Graph Have Euler Circuit or Euler Path?
Number of Odd Vertices Euler Circuit? Euler Path?
0 Yes Yes
1 No such case No such case
2 No Yes
3 or more No No

Before we look at an example, watch this video to consolidate your understanding of Euler’s theorems: https://www.youtube.com/watch?v=teVOACnswkY

Example 5. (The Konigsberg Bridge Problem Solved)

The graph is reproduced here. Every edge in graph (b) represents a bridge. We want to answer the question “Can a person take a walk around town fully crossing every bridge once and only once?”

Konigsberg

Since the graph has four odd vertices (Check it!), by the two theorems (second parts in both), neither an Euler circuit nor an Euler path can exist.
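The check carried out in this example can be automated for any graph given as a list of edges. In the sketch below the function name `euler_status` is hypothetical, and the vertex labels N and S (north and south banks) and A and B (the two islands) are assumptions made for this illustration, since the labels in the figure may differ.

```python
from collections import defaultdict

def euler_status(edges):
    """Classify a graph given as a list of (u, v) edges; repeated pairs are multiple edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)          # a loop (u == v) adds 2 to the degree of u

    # The theorems apply to connected graphs only, so check connectivity first.
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    if len(seen) < len(adj):
        return "disconnected: the theorems do not apply"

    odd = [v for v in adj if len(adj[v]) % 2 == 1]
    if len(odd) == 0:
        return "Euler circuit"
    if len(odd) == 2:
        return "Euler path but no Euler circuit"
    return "neither"

# The seven bridges: island A has five bridges, the other three land masses have three each.
konigsberg = [("N", "A"), ("N", "A"), ("S", "A"), ("S", "A"),
              ("A", "B"), ("N", "B"), ("S", "B")]
print(euler_status(konigsberg))
```

Because the graph is built from its edges, isolated vertices cannot be represented in this sketch.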

Fleury’s Algorithm

This algorithm will help us find either an Euler circuit (no odd vertices) or an Euler path (exactly two odd vertices) in a connected graph. The algorithm is very easy to follow. The key is to deal with bridges (also known as cut edges).

Two points to keep in mind when finding either an Euler circuit or Euler path:

  • The algorithm starts with choosing a starting vertex. For finding an Euler circuit, start with any vertex. For finding an Euler path, start with either one of the two odd vertices.

  • At each later step, if you have a choice of edges, don’t choose a bridge of the part of the graph not yet traveled unless you have to.
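The two points above are enough to write a small program. The sketch below is one straightforward way to code Fleury's algorithm, written for clarity rather than speed; the helper names and the five-vertex graph `demo_edges` are hypothetical and are not the graph in the figure of the exercise.

```python
from collections import defaultdict

def _reach(adj, start):
    """Number of vertices reachable from start along untraveled edges."""
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen)

def _is_bridge(adj, u, v):
    """True if removing the untraveled edge u-v cuts u off from part of what it could reach."""
    before = _reach(adj, u)
    adj[u].remove(v); adj[v].remove(u)
    after = _reach(adj, u)
    adj[u].append(v); adj[v].append(u)
    return after < before

def fleury(edges):
    """Return the vertices of an Euler circuit or Euler path, in travel order."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v); adj[v].append(u)
    odd = [v for v in adj if len(adj[v]) % 2 == 1]
    if len(odd) not in (0, 2):
        raise ValueError("more than two odd vertices: no Euler circuit or path")
    current = odd[0] if odd else edges[0][0]   # start at an odd vertex if there is one
    trail = [current]
    for _ in range(len(edges)):
        candidates = list(adj[current])        # copy: _is_bridge edits adj temporarily
        if not candidates:
            raise ValueError("the graph is not connected")
        # Prefer an edge that is not a bridge; take a bridge only when forced to.
        nxt = next((w for w in candidates if not _is_bridge(adj, current, w)), candidates[0])
        adj[current].remove(nxt); adj[nxt].remove(current)
        trail.append(nxt)
        current = nxt
    return trail

# Hypothetical graph: two triangles sharing the vertex C (every vertex is even).
demo_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E"), ("E", "C")]
trail = fleury(demo_edges)
print(" -> ".join(trail))
```

After solving the exercise by hand, you can enter its edges in place of `demo_edges` and compare the program's circuit with your own; several different Euler circuits may exist.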

Do it yourself.

Use the following graph to find an Euler circuit:

FleuryAlgorithm

Then watch this video: https://www.youtube.com/watch?v=F4BM6fnLl04.

Another good video: https://www.youtube.com/watch?v=DEqFYHRrqRY

Eulerizing and Semi-Eulerizing Graphs

We will use the Euler Theorems to help us design optimal street routes for graphs with more than two odd vertices. by the first Euler theorem, these graphs have no Euler circuit. To design optimal street routes, the key idea is to turn odd vertices into even vertices by adding duplicate (not new edges!) edges in strategic places. This is called an eulerization of the original graph.

To eulerize a graph without any Euler circuit, we must first find odd vertices and add duplicate edges associated with those vertices in a way to minimize the total distance, time or cost walking through the route. Once eulerization is done, we can use the Fleury algorithm to find an Euler circuit based on the eulerized graph.

When a path is of interest, we can add duplicate edges to a graph that does not already have an Euler path so that only two of the vertices of the new graph are odd. This process is called a semi-eulerization, and we say that the original graph has been semi-eulerized.

In summary:

When eulerizing or semi-eulerizing a graph, adding duplicate edges is allowed, while adding new edges is forbidden.

Read the pages from 158-156 of the textbook for examples.

Watch the video: https://www.youtube.com/watch?v=Z7MRbPrxjcQ. Question: In the first example, why do we eulerize the graph rather than semi-eulerize it? Watch one more: https://www.youtube.com/watch?v=lUqCtywkskU

Suggested Exercises from the Textbook

1, 11, 19, 21, 22, 23, 35, 37, 41

Chapter 6. The Mathematics of Touring

Given a set of cities and the distance between each pair of them, how can we find the shortest possible route that visits every city exactly once and returns to the starting point? Such a route is called a Hamilton circuit.

6.1 What is a Traveling Salesman Problem?

The Traveling Salesman Problem (TSP) is a metaphor for many real-life problems that share a common mathematical structure.

There are three elements common to all TSPs:

  • A traveler. This could be a person, an animal, or a vehicle.

  • A set of sites. These are the places the traveler must visit.

  • A set of costs. These are the expenses of traveling from a site to another. The costs might be distance, time, or money.

A solution to a TSP is a “trip” that starts and ends at one site and visits all the other sites once and only once. Such a trip is called a tour. An optimal solution (may not be unique!) to a TSP is a tour of minimal total cost.

Example 1.

Tom is a traveling salesman. He is planning his next business trip, where he will visit customers in four cities (A, B, C, and D). City A is Tom’s hometown, so he starts and ends his trip at this city.

The graph shown below (not drawn to scale) is a model for the trip. It shows a set of cities and the distances between each pair.

What is the optimal (in terms of distance traveled) route that the salesman must follow to complete his tour?

TSP.Cities

One possible trip is A, B, C, D, A. Can you list all possible trips? Which is (are) the optimal one(s)?

We can construct a cost (distance here) chart for the TSP:

Cost Chart for a TSP

     A    B    C    D
A    -    10   15   20
B    10   -    35   25
C    15   35   -    30
D    20   25   30   -

which is symmetric across the diagonal.
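With only four cities, every possible trip can be listed and compared. The following is a minimal brute-force sketch in Python using the distances from the cost chart; the names `cost`, `dist` and `tours` are our own. It fixes A as the starting city and tries every ordering of the other three cities.

```python
from itertools import permutations

# Distances from the cost chart; each pair is stored once because the chart is symmetric.
cost = {("A", "B"): 10, ("A", "C"): 15, ("A", "D"): 20,
        ("B", "C"): 35, ("B", "D"): 25, ("C", "D"): 30}

def dist(u, v):
    return cost[(u, v)] if (u, v) in cost else cost[(v, u)]

tours = {}
for middle in permutations("BCD"):
    tour = ("A",) + middle + ("A",)
    tours[tour] = sum(dist(a, b) for a, b in zip(tour, tour[1:]))

best = min(tours.values())
optimal = sorted(t for t, total in tours.items() if total == best)
for tour, total in sorted(tours.items()):
    print(", ".join(tour), total)
print("optimal:", optimal, "with total distance", best)
```

Listing all trips works here because there are only 6 of them. The number of trips grows very quickly with the number of cities, so this approach is not practical for large problems.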

Hamilton Paths and Circuits

In 1857, William Rowan Hamilton invented the Icosian board game. The purpose of the game is to find a trip along the edges of a graph similar to the one below:

IcosianGame

The trip should visit each of the vertices once and only once, returning at the end to the starting vertex, so the game is a TSP.

A Hamilton path in a connected graph is a path that visits each of the vertices once and only once.

A Hamilton circuit in a connected graph is a circuit that visits each of the vertices once and only once.

The graph for the Konigsberg bridges does have Hamilton circuits and paths. Choosing a vertex to start with, can you list as many as you can?

Hamilton paths or circuits require all the vertices to be visited once and only once, while Euler paths or circuits require all the edges to be traveled once and only once. The two sets of concepts are very different.

Watch the video: https://www.youtube.com/watch?v=AamHZhAmR7o for Hamiltonian circuits and Hamiltonian paths.

Example 1.

Given the following graph,

HamiltonCircuitPath

Find a Hamilton path and a Hamilton circuit. Does the graph have an Euler circuit or an Euler path?

Solution.

Some Hamilton paths are: A, F, B, C, G, D, E and A, F, B, E, C, G, D.

Some Hamilton circuits are: A, F, B, C, G, D, E, A and A, F, B, E, C, G, D, A.

The graph does have an Euler circuit, since every vertex is even.

Example 2.

The following graph does not have a Hamilton circuit or path, since if it does, the circuit or path must start with either E or D and then C must be visited more than once.

NoHamilton

A simple graph (no loops or multiple edges) in which every vertex is connected to every other vertex is called a complete graph. Complete graphs are a family of graphs that have Hamilton circuits and thus Hamilton paths.

A complete graph with \(N\) vertices has the following properties:

  1. The degree of every vertex is \(N-1\).

  2. The number of edges is \(\frac{N(N-1)}{2}\).

  3. The number of Hamilton paths is \(N!\).

  4. The number of Hamilton circuits is \((N-1)!\).

The first property follows from the definition. The second follows from Euler’s sum of degrees theorem, which says that the total degree of all vertices (\(N(N-1)\)) is twice the number of edges. The third property can be obtained from the multiplicative counting principle introduced earlier. Specifically, we label the vertices 1, 2, 3, and so on. To build a Hamilton path, we can start at any of the \(N\) vertices, so there are \(N\) ways to choose the starting vertex. Once the starting vertex is chosen, the second vertex can be chosen in \(N-1\) ways, since the \(N-1\) remaining vertices are all connected to the starting vertex. The process continues until only one vertex is left, and then we have only one choice. By the multiplicative counting principle, there are \(N\cdot (N-1)\cdot (N-2)\cdots 1\), or \(N!\), ways to create a Hamilton path; that is, there are \(N!\) Hamilton paths. A Hamilton circuit does not depend on the starting vertex, since it is a cycle. So, when we use the counting principle, the first task can be done in only one way! The count is then \(1\cdot (N-1)\cdot (N-2)\cdots 1\), or \((N-1)!\).
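The two counts can be checked by direct enumeration for small \(N\). In the sketch below (our own illustration; the function name `count_hamilton` is hypothetical), every ordering of the vertices of a complete graph is a Hamilton path, and a Hamilton circuit is counted by fixing vertex 0 as the reference starting point. As in the formulas above, a circuit and the same circuit traveled in reverse are counted separately.

```python
from itertools import permutations
from math import factorial

def count_hamilton(n):
    """Count Hamilton paths and circuits in the complete graph on n vertices."""
    orderings = list(permutations(range(n)))
    paths = len(orderings)                            # any ordering is a path
    circuits = sum(1 for p in orderings if p[0] == 0) # fix the reference vertex
    return paths, circuits

for n in range(3, 7):
    print(n, count_hamilton(n), (factorial(n), factorial(n - 1)))
```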

Suggested Exercises from Textbook

4, 13

Chapter 7. The Mathematics of Networks

Skip

Chapter 8. The Mathematics of Scheduling

Consider the problem:

Apartment Unlimited is an apartment maintenance company that refurbishes apartments before new tenants move in. The following table shows the individual pieces of work (i.e., tasks) for refurbishing a one-bedroom apartment, their processing times (in hours), and their precedent tasks. Task \(A\) is said to be a precedent to task \(B\), if task \(B\) cannot be started before task \(A\) is completed.

Chart for a Maintenance Project

Task                        Label (processing time)    Precedent tasks
Bathrooms (clean)           B(2)                       P
Carpets (shampoo)           C(1)                       S, W
Filters (replace)           F(0.5)
General cleaning            G(2)                       B, F, K
Kitchen (clean)             K(3)                       P
Lights (replace bulbs)      L(0.5)
Paint                       P(6)                       L
Smoke detectors (battery)   S(0.5)                     G
Windows (wash)              W(1)                       G

  1. How should the tasks be scheduled (prioritize the tasks) in order to minimize the project finishing time (the number of hours needed to finish the project)?

  2. Who should do what and when?

  3. How can you build a graph model to package all the information (i.e., show all the precedence relations) between tasks?

8.1 An introduction to Scheduling

The four basic ingredients that make up any scheduling problem are described below.

  • The processors. These refer to the “workers” who carry out the work. A processor may be a human being, a robot, a computer, and so on.

  • The tasks. These refer to individual pieces of work that cannot be broken up into smaller units and thus cannot be carried out by different processors. At any particular moment, tasks are in one of the following four states:

    • Ineligible. The task cannot be started because some of the prerequisites for the task have not yet been completed.
    • Ready. The task has not been started but could be started at this time.
    • In execution: The task is being carried out by one of the processors, or
    • Completed.
  • The processing times. The processing time (called \(P\)-time) of a task is the amount of time, without interruption, required by one processor to execute that task. Some assumptions are made to simplify things:

    • Versatility. Any processor can execute any task.
    • Uniformity. The processing time for a task is the same regardless of which processor is executing the task.
    • Persistence. Once a processor starts a task, it will complete it without interruption.
  • The precedence relations. Precedence relations are formal restrictions on the order in which the tasks can be executed much like those course prerequisites in the school catalog. In the case of tasks, these prerequisites are called precedence relations. A precedence relation can be abbreviated by writing \(X\rightarrow Y\), meaning that task \(X\) precedes (or is precedent to) task \(Y\). When there is no precedence relation between two tasks, the tasks are said to be independent. There are two final notes about precedence relations:

    • Precedence relations are transitive: If task \(X\) is precedent to task \(Y\) and task \(Y\) is precedent to task \(Z\), then task \(X\) must be precedent to task \(Z\).
    • Precedence relations cannot form a cycle. A cycle looks like \(X\rightarrow Y\rightarrow Z\rightarrow X\).

We will use notation such as \(X(5)\) to mean that the task called \(X\) has a processing time of 5 units. The finishing time of a project is denoted by \(fin\), and the finishing time of an optimal schedule is denoted by \(opt\).

Example 1.

Jennifer wrecked her car. The car is taken to a garage. The two workers (called \(P_1\) and \(P_2\)) do the repair. The repairs on the car can be broken into four different tasks:

  • exterior body: A(4)

  • engine repairs: B(5)

  • painting and exterior finish work: C(7)

  • transmission repair: D(3)

where the times are in hours.

The only precedence relation is \((A\rightarrow C)\). The two workers always work together on a repair project, but each takes on a different task without arguing with each other.

The following figures show some candidate schedules out of many more:

SchedulingTimeLine

Comment on each schedule.

Solution.

Figure (a) shows a legal schedule with \(Fin=12\) hours, figure (b) shows an illegal schedule that violates the precedence relation \((A\rightarrow C)\), figure (c) shows an optimal schedule with \(Opt = 11\) hours, and figure (d) shows an alternative optimal schedule.

Two useful lessons can be drawn from this example:

  • If there are many tasks or processors, the figure can be very complex, so a good scheduling algorithm is needed.

  • There is an absolute minimum time (called the critical time) that no schedule can break, no matter how good an algorithm we use and how many processors we put to work.
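The critical time can be computed directly from the tasks and precedence relations. The sketch below is our own illustration, assuming the usual characterization of the critical time as the total processing time along the longest chain of tasks linked by precedence relations. The function name `critical_time` and the dictionary layout are hypothetical; the data come from Example 1 and from the apartment maintenance chart at the start of this chapter.

```python
from functools import lru_cache

def critical_time(times, followers):
    """Longest total processing time along any chain of precedence relations."""
    @lru_cache(maxsize=None)
    def longest_from(task):
        nxt = followers.get(task, [])
        return times[task] + max((longest_from(f) for f in nxt), default=0)
    return max(longest_from(task) for task in times)

# Example 1: A(4), B(5), C(7), D(3) with the single relation A -> C.
car = critical_time({"A": 4, "B": 5, "C": 7, "D": 3}, {"A": ["C"]})

# Apartment project: followers[X] lists the tasks that X is precedent to.
apartment_times = {"B": 2, "C": 1, "F": 0.5, "G": 2, "K": 3,
                   "L": 0.5, "P": 6, "S": 0.5, "W": 1}
apartment_followers = {"L": ["P"], "P": ["B", "K"], "B": ["G"], "F": ["G"],
                       "K": ["G"], "G": ["S", "W"], "S": ["C"], "W": ["C"]}
apartment = critical_time(apartment_times, apartment_followers)
print(car, apartment)
```

For Example 1 the result agrees with the optimal finishing time of 11 hours, which means the optimal schedules there already reach the absolute minimum.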

Directed Graphs

We need some basic concepts regarding directed graphs.

A directed graph, or digraph for short, is a graph in which the edges have a direction associated with them, typically indicated by an arrowhead.

Some examples of digraphs:

Digraphs are useful when we want to describe asymmetric relationships. Situations involving asymmetric relationships are:

  • Romantic love. That X loves Y does not imply that Y loves X.

  • Web linkage. There may be a hyperlink (or just link) on Web page X that allows you to jump directly to Web page Y, but no link on Web page Y that sends you to X.

  • Traffic flow. Some streets are one-way and others are two-way streets.

  • Basketball tournaments. One team wins against another team.

  • Organization charts. In any organization (a corporation, a university, the military, etc.), two individuals X and Y in a chain of command may mean that X is Y’s immediate boss.

  • Scheduling. Task X must be followed by task Y.

Rather than using edges and circuits, we use arcs and cycles in digraphs. Rather than talking about the degree of a vertex, we use the notions of indegree and outdegree. The indegree of a vertex X is the number of arcs that have X as their ending vertex; the outdegree of X is the number of arcs that have X as their starting vertex.

In the following digraph,

The indegree of vertex A is 1, but the outdegree is 3; the indegree of vertex D is 2, but the outdegree is 3. There are two paths from A to D: A, B, D and A, D. There are two cycles: A, B, D, A and A, D, A.

There are some online tools that allow us to create graphs. Here is one: https://graphonline.ru/en/

Priority-List Scheduling

Every project is best described by its project digraph. It is the basic graph model used to package all the information in a scheduling problem. The vertices in a project digraph are the tasks, including two fictitious tasks (START and END), and the arcs are the precedence relations. The project digraph allows us to better visualize the execution of the project as a flow, moving from left to right.

A priority list is a list of all the tasks prioritized in the order we prefer to execute them. The number of possible priority lists in a project consisting of \(M\) tasks is \(M!\) or \(M\cdot (M-1)\cdot (M-2)\cdots \cdot 2 \cdot 1\).
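To see how a priority list produces a schedule, here is a minimal simulation in Python. It is a sketch under our own assumptions, not the textbook's presentation: whenever a processor is free, it takes the first ready task on the priority list; if several processors are free at the same moment, the lowest-numbered one chooses first; all processing times are positive. The function name `priority_list_schedule` is hypothetical. The data are from Example 1 (the car repair).

```python
def priority_list_schedule(times, prereqs, priority, n_processors):
    """Simulate priority-list scheduling; return (finishing time, schedule)."""
    finish = {}                      # task -> completion time, set when assigned
    busy_until = [0] * n_processors
    schedule = []                    # (task, processor number, start, end)
    t = 0
    while len(finish) < len(priority):
        for p in range(n_processors):
            if busy_until[p] > t:
                continue             # processor p is still working
            for task in priority:
                if task in finish:
                    continue         # already assigned
                # Ready means every prerequisite has been completed by time t.
                if all(finish.get(q, float("inf")) <= t
                       for q in prereqs.get(task, [])):
                    finish[task] = t + times[task]
                    busy_until[p] = finish[task]
                    schedule.append((task, p + 1, t, finish[task]))
                    break
        if len(finish) == len(priority):
            break
        running = [f for f in finish.values() if f > t]
        if not running:
            raise ValueError("no task can start; check the precedence relations")
        t = min(running)             # jump to the next completion time
    return max(finish.values()), schedule

times = {"A": 4, "B": 5, "C": 7, "D": 3}
prereqs = {"C": ["A"]}               # A -> C
fin_1, schedule_1 = priority_list_schedule(times, prereqs, ["A", "B", "C", "D"], 2)
fin_2, schedule_2 = priority_list_schedule(times, prereqs, ["B", "D", "A", "C"], 2)
print(fin_1, schedule_1)
print(fin_2, schedule_2)
```

The two priority lists give different finishing times, which shows why the choice of the priority list matters.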

For creating a schedule based on a priority list, just watch this video: https://www.youtube.com/watch?v=DKxG-t7VyCs&t=178s. It is about the first 3 examples in this section of the book.

Here is an app that does priority-list scheduling: https://media.pearsoncmg.com/cmg/pmmg_mml_shared/animations/author/tannenbaum/applets/interactives/digraphs/index.html

Chapter 9. Population Growth Models

The United States Population Clock is a digital display operated by the Census Bureau that shows the official resident population of the United States at the moment you are visiting this web page: https://www.census.gov/popclock/.

The ticking of the clock is based on a mathematical model for the growth of the U.S. population. Like all models, this model is not intended to be a perfect description of the reality but just an approximation.

Population models are used by economists to make economic forecasts, by epidemiologists to project and prepare for epidemics, by urban planners to plan the future of cities, by demographers to develop population pyramids, by biologists to study animal populations, and so much more.

In this chapter we will look at three mathematical models of population growth or decline (i.e., negative growth). These models are: the linear growth model, the exponential growth model, and the logistic growth model.

9.1 Sequences and Population Sequences

Before we introduce each model, we need some mathematical preparation:

  • A sequence is an infinite, ordered list of numbers. The individual numbers in the sequence are called terms. For example, the list \(1, 3, 5, 7, 9, \cdots\) is a sequence representing all positive odd numbers, where the symbol \(\cdots\) means the pattern continues or is the equivalence of “and so on.” Another example of a sequence is the list \(1, 1, 2, 3, 5, 8, 13, 21, \cdots\) where, starting with the third term, each subsequent term is the sum of the immediate two previous terms. This sequence is called the Fibonacci sequence and the numbers in this sequence are called Fibonacci numbers.

  • In general, we use notation such as \(a_1, a_2, a_3, \cdots\) to represent a sequence. The terms are respectively called “the first term”, “the second term”, and so on. The notation may be different in different situations.

  • If the pattern of a sequence is clear, we can use an explicit formula for the \(n\)th term. For example, this sequence \(1, 3, 5, 7, 9, \cdots\) has a clear \(n\)th term of \(a_n=2n-1\). For the Fibonacci sequence \(1, 1, 2, 3, 5, 8, 13, 21, \cdots\), the \(n\)th term, denoted by \(F_n\), can be shown to equal

\[F_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^n-\left(\frac{1-\sqrt{5}}{2}\right)^n\right]\]

  • A sequence may be described by a recursive formula, which defines a term using one or more previous terms. The Fibonacci sequence is in fact defined by \(F_n=F_{n-1}+F_{n-2}, ~n =3, 4, \cdots\) with \(F_1 = 1\) and \(F_2=1\). A recursive formula is not always possible.

  • We use \(P_0, P_1, P_2, \cdots\) to denote a population sequence, where \(P_0\) is the size of the initial population, \(P_1\) is the size of the population in the first generation, \(P_2\) is the size of the population in the second generation, and so on.
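The explicit and recursive descriptions of the Fibonacci sequence given above can be compared numerically. The following Python sketch is our own illustration (the function names are hypothetical). Because the explicit formula involves \(\sqrt{5}\), the computer works with decimal approximations, so the result is rounded to the nearest whole number; this is reliable only for moderately small \(n\).

```python
from math import sqrt

def fib_recursive(n):
    """F_1 = F_2 = 1 and F_n = F_(n-1) + F_(n-2), built up term by term."""
    prev, curr = 1, 1
    for _ in range(n - 2):
        prev, curr = curr, prev + curr
    return curr

def fib_explicit(n):
    """The explicit formula, rounded to remove floating-point error."""
    phi = (1 + sqrt(5)) / 2
    psi = (1 - sqrt(5)) / 2
    return round((phi**n - psi**n) / sqrt(5))

first_eight = [fib_recursive(n) for n in range(1, 9)]
print(first_eight)
print(all(fib_explicit(n) == fib_recursive(n) for n in range(1, 41)))
```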

A convenient way to visualize a population sequence is with a time-series graph, which is a line graph with time on the \(x-\)axis and the population size on the \(y-\)axis. The following is an example graph.

Example 1.

Consider the sequence defined by the explicit formula \(a_n = 5n+10\).

  1. Find \(a_0\).

  2. Find \(a_8\).

  3. Suppose \(a_n=45\). Find \(n\).

Solution.

  1. \(a_0=5\cdot 0 + 10 = 10\).

  2. \(a_8=8\cdot 5 + 10 = 40+10=50\).

  3. We know \(a_n=45\). That is, \(5\cdot n + 10 = 45\). To solve this equation, subtract 10 from both sides to get the new equation \(5\cdot n = 35\). Dividing both sides by 5 gives \(n=7\).

Example 2.

Consider the sequence defined by the explicit formula \(a_n = \frac{4n+52}{n+3}\).

  1. Find \(a_0\).

  2. Find \(a_5\).

  3. Find the value of \(n\) for which \(a_n=6\).

Solution.

  1. \(a_0= \frac{4\cdot 0+52}{0+3} = \frac{52}{3}=17\frac{1}{3}\).

  2. \(a_5=\frac{4\cdot 5+52}{5+3} = \frac{72}{8}=9\).

  3. We know \(a_n=6\). That is, \(\frac{4n+52}{n+3} = 6\). To solve this equation, we first multiply both sides by \((n+3)\) to remove the denominator. We now have the new equation \(4n +52 = 6(n+3)\). Expand the right side to get \(4n +52 = 6n + 18\). Subtracting \(4n\) from both sides gives the simplified equation \(52=2n+18\). Subtracting \(18\) from both sides gives \(34=2n\). Dividing both sides by 2 gives \(n=17\).
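The three parts of Example 2 can be double-checked with exact fractions. This is a small sketch of our own: the function name `a` mirrors the notation \(a_n\), and the search range of 1000 is an arbitrary choice that is large enough here.

```python
from fractions import Fraction

def a(n):
    """The explicit formula a_n = (4n + 52) / (n + 3), kept as an exact fraction."""
    return Fraction(4 * n + 52, n + 3)

a_0 = a(0)
a_5 = a(5)
# Search for the value of n with a_n = 6 instead of solving the equation by hand.
n_for_6 = next(n for n in range(1000) if a(n) == 6)
print(a_0, a_5, n_for_6)
```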

Example 3.

Consider the sequence defined by the recursive formula \(a_n = 2a_{n-1}+a_{n-2}\) and starting with the first two terms \(a_0 = 2\) and \(a_1 = 3\).

  1. Find the next 3 terms.

  2. Find \(a_5\).

Solution.

  1. By the recursive formula,

\[a_2 = 2a_1+a_0=2\cdot 3 + 2=6+2=8\]

\[a_3 = 2a_2+a_1=2\cdot 8 + 3=16+3=19\]

\[a_4 = 2a_3+a_2=2\cdot 19 + 8=38+8=46\]

  2. Applying the recursive formula once more, \(a_5 = 2a_4+a_3=2\cdot 46 + 19=92+19=111\).
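A recursive formula translates naturally into a loop that builds the sequence one term at a time. The following sketch (our own; the function name `recursive_terms` is hypothetical) generates \(a_0\) through \(a_5\) for Example 3.

```python
def recursive_terms(a0, a1, count):
    """Return the first `count` terms of a_n = 2*a_(n-1) + a_(n-2)."""
    terms = [a0, a1]
    while len(terms) < count:
        terms.append(2 * terms[-1] + terms[-2])
    return terms

terms = recursive_terms(2, 3, 6)   # a_0 through a_5
print(terms)
```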

9.2 The Linear Growth Model

Before we introduce this model, we introduce arithmetic sequences. An arithmetic sequence is a sequence such that each term equals the immediate previous term plus a constant. The constant, usually denoted by \(d\), is called the common difference. When \(d\) is positive, the sequence increases or grows; when \(d\) is negative, the sequence decreases or shrinks; when \(d\) is 0, the sequence stays constant.

The following sequences are all arithmetic sequences:

  1. 2, 5, 8, 11, 14, 17, … with \(d=3\).

  2. 6, 10, 14, 18, 22, 26, 30, … with \(d=4\).

  3. \(10, 7, 4, 1, -2, -5, -8\), … with \(d=-3\).

  4. 2.4, 3.9, 5.4, 6.9, 8.4, … with \(d=1.5\).

Given an arithmetic sequence with the first term \(a_1\) and the common difference \(d\),

  • the second term can be determined by \(a_2=a_1+d=a_1+(2-1)d\),

  • the third term can be determined by \(a_3=a_2+d=a_1+d+d=a_1+2d=a_1+(3-1)d\),

  • the fourth term can be determined by \(a_4=a_3+d=a_1+2d+d=a_1+3d=a_1+(4-1)d\), and

  • in general, the pattern suggests that the \(n\)th term can be determined in terms of \(a_1\) and \(d\) by the formula:

\[a_n = a_1 + (n-1)d\]

We might want to find the sum of the first several terms of an arithmetic sequence. For example, how can we find the sum of the first 10 terms in the arithmetic sequence 3, 7, 11, …?

We can list the first 10 terms:

\[3, 7, 11, 15, 19, 23, 27, 31, 35, 39\]

We can pair the 10 numbers as this: (3, 39), (7, 35), (11, 31), (15, 27), and (19, 23).

Note that the sum of the two numbers in each pair is the same, namely 42, and there are 5 such pairs, so the sum of all the 10 terms equals \(42\cdot 5=210\). Let’s do a trick. Rewrite \(42\cdot 5\) as \(42\cdot 5=42\cdot \frac{10}{2}=\frac{42}{2}\cdot 10=\frac{3+39}{2}\cdot 10\), which is the average of the first term and the last term, multiplied by the number of terms!

In general, for an arithmetic sequence, the sum of the first \(n\) terms equals the average of the first term and the \(n\)th term, multiplied by \(n\).

Example 4.

The first two terms of an arithmetic sequence are 3 and 8. Find the sum of the first 100 terms.

Solution.

First of all, \(a_1=3\) and \(d=8-3=5\).

To find the sum of the first 100 terms, we need to find the 100th term, which, by the formula \(a_n = a_1 + (n-1)d\), is given by

\[a_{100} = a_1+99d=3+99\cdot 5=3+495=498\]

The sum of the first 100 terms equals the average of the first term and the last term, multiplied by the number of terms, which is 100. That is, \(\frac{3+498}{2}\cdot 100=25050\).

Example 5.

The following is an arithmetic sequence

\[5, 11, 17, ..., 491\]

  1. How many terms are there in the sequence?

  2. Find the sum of the terms in the above sequence.

Solution.

First of all, \(a_1=5\) and \(d=11-5=6\).

  1. Suppose 491 is the \(n\)th term in the sequence. Since the \(n\)th term in terms of the first term and the common difference is given by \(a_n = a_1+(n-1)d\), to find \(n\), we set \(491=5+(n-1)\cdot 6\). Subtracting 5 from both sides gives \(486=(n-1)\cdot 6\). Dividing both sides by 6 gives \(81=n-1\). Thus, \(n = 82\). That is, there are 82 terms in the given sequence.

  2. The sum of these 82 terms equals the average of the first term (5) and the last term (491), multiplied by the number of terms, which is 82. That is, \(\frac{5+491}{2}\cdot 82=20336\).
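The formulas used in Examples 4 and 5 can be collected into three small functions. This is a sketch of our own (the function names are hypothetical), with the first term written as `a1` and the common difference as `d`.

```python
def nth_term(a1, d, n):
    """n-th term of an arithmetic sequence: a_1 + (n - 1) d."""
    return a1 + (n - 1) * d

def number_of_terms(a1, d, last):
    """Position n of the term `last`, found by solving last = a_1 + (n - 1) d."""
    return round((last - a1) / d) + 1

def arithmetic_sum(a1, d, n):
    """Average of the first and n-th terms, multiplied by the number of terms."""
    return (a1 + nth_term(a1, d, n)) * n / 2

print(nth_term(3, 5, 100), arithmetic_sum(3, 5, 100))          # Example 4
print(number_of_terms(5, 6, 491), arithmetic_sum(5, 6, 82))    # Example 5
```

Adding the terms one by one, for instance with `sum(range(5, 492, 6))`, gives the same total as the formula for Example 5.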

A population is said to grow linearly, if the population can be modeled by an arithmetic sequence; that is, in each generation the population changes by a constant amount. The model is called a linear growth model.

If the initial population is \(P_0\) and the constant is \(d\), the sequence can be written as

\[P_0, ~~P_0 +d, ~~P_0+2d, ~~P_0+3d, ~~P_0+4d, ...\]

An explicit formula for the \(n\)th term of the population sequence is

\[P_n = P_0 + n\cdot d\]

When the constant \(d\) is positive, the population grows linearly; when the constant \(d\) is negative, the population shrinks linearly, and in this case the sequence ends at the last positive term (because population cannot be negative).

A recursive formula for the \(n\)th term of the population sequence is

\[P_n = P_{n-1} + d\]

Example 6.

If a population with initial size \(P_0=80\) grows linearly with constant 125, what is \(P_{50}\) (the population after 50 generations)?

Solution.

By the formula,

\[P_n = P_0 + n\cdot d\]

\[P_{50} = 80 + 50\cdot 125=80+6250=6330\]

That is, the population will be 6330 after 50 generations.

Example 7.

A population with initial size \(P_0=578\) shrinks linearly following the recursive formula \(P_n=P_{n-1}-25\).

  1. Find the \(P_1, P_2\), and \(P_3\).

  2. Give an explicit formula for \(P_n\).

  3. Find \(P_{10}\) (the population after 10 generations).

  4. After how many generations will the growth model fail?

Solution.

  1. The 3 terms are

\[P_1=P_0-25=578-25=553\]

\[P_2=P_1-25=553-25=528\]

\[P_3=P_2-25=528-25=503\]

  2. Since \(P_n=P_{n-1}-25\) can be written as \(P_n=P_{n-1}+(-25)\), the sequence is an arithmetic sequence with \(P_0=578\) and \(d=-25\), and the explicit formula is

\[P_n = P_0 + nd=578+n(-25)=578-25n\]

  3. \(P_{10} = 578-25(10)=578-250=328\).

  4. According to part 2, after \(n\) generations the population is \(P_n = 578-25n\). The growth model fails when \(P_n\le 0\). To find \(n\), we solve the inequality \(578-25n\le 0\), or \(578\le 25n\). Dividing both sides by 25 gives \(23.12 \le n\). The smallest such \(n\) is 24. That is, the model still works after 23 generations (\(P_{23}=3\)) but fails at the 24th generation, where it would give \(P_{24}=-22\).
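The explicit formula \(P_n = P_0 + n\cdot d\) makes Examples 6 and 7 easy to check by computer. The sketch below is our own illustration (the function name `linear_population` is hypothetical). For the last part of Example 7 it simply steps through the generations until the formula stops giving a positive population.

```python
def linear_population(p0, d, n):
    """Explicit formula of the linear growth model: P_n = P_0 + n d."""
    return p0 + n * d

p50 = linear_population(80, 125, 50)       # Example 6
p10 = linear_population(578, -25, 10)      # Example 7

# First generation at which the shrinking population is no longer positive.
n = 0
while linear_population(578, -25, n) > 0:
    n += 1
first_failure = n
print(p50, p10, first_failure, linear_population(578, -25, first_failure - 1))
```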

The Exponential Growth Model

Before we introduce the exponential growth model, we need to introduce the geometric sequences.

A sequence is a geometric sequence, if the ratio of each term to its immediate previous term is a constant. Such a constant, usually denoted by \(R\), is called the common ratio.

If a geometric sequence \(a_1, a_2, a_3, ...\) has a common ratio \(R\), then

  • \(a_2=R(a_1)=R^{2-1}(a_1)\),

  • \(a_3=R(a_2)=R(Ra_1)=R^2 a_1=R^{3-1}(a_1)\),

  • \(a_4=R(a_3)=R(R^2a_1)=R^3a_1=R^{4-1}(a_1)\), and

  • In general, \(a_n=R^{n-1}a_1\).

If a population grows from \(X\) in one generation to \(Y\) in the next generation, we call the quantity \(\frac{Y-X}{X}\) the growth rate, denoted by \(r\). Setting \(\frac{Y-X}{X}=r\) gives \(Y-X=rX\) or \(Y=X+rX\) or \(Y=(1+r)X\). That is, the population grows by a factor of \(1+r\).

A population is said to grow exponentially or can be described by an exponential growth model, if in each generation the population grows by the same constant factor (\(1+r\)). That is, the sizes of an exponential growth population in different generations can be described by a geometric sequence with a common ratio \(R=1+r\). Specifically, if the initial population is \(P_0\), then

  • the population after one generation is \((1+r)\cdot P_0\),

  • the population after two generations is \((1+r)^2\cdot P_0\),

  • the population after three generations is \((1+r)^3\cdot P_0\), and so on.

When \(r>0\), \(R>1\); when \(-1<r<0\), \(0<R<1\).

The Omicron variant has an average basic reproduction number (\(R\)) of 9.5. The growth rate \(r\) is \(9.5-1=8.5\).

Example 8.

  1. On March 16, 2020, the number of confirmed COVID-19 cases in the USA was 27486. On March 23, 2020, the number of cases was 94894. What was the weekly growth rate? If the rate continued, what was the estimated number of cases on April 27, 2020?

  2. In 2010, 100,000 cases of equine flu were reported. In 2011, the number of new cases was 80,000. What was the growth rate? If the rate continued, what was the estimated number of cases in 2020?

Solution.

  1. The growth rate is \(r=\frac{94894-27486}{27486}\approx 2.45\), so \(R=1+r=3.45\). April 27 is 5 weeks after March 23, so, taking the March 23 count as \(P_0\), the estimated number of cases would be \(P_5=R^{5}P_0=3.45^{5}\cdot 94894\approx 46,380,372\) if the rate continued. This is not a good estimate.

  2. The growth rate is \(r=\frac{80,000-100,000}{100,000}=-0.20\). That is, the number of cases decreased by 20%. Since \(P_0 = 100,000\) and \(R=1+r=1-0.20=0.80\), \(P_{10}=R^{10}P_0=0.80^{10}\cdot 100,000=10737.42\), approximately 10737.
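The computations in Example 8 follow the same two steps: find the growth rate from two consecutive generations, then apply the factor \(1+r\) repeatedly. Here is a minimal Python sketch of our own (the function names are hypothetical). For the COVID-19 part the growth rate is rounded to two decimal places first, to match the hand computation with \(R=3.45\).

```python
def growth_rate(x, y):
    """Growth rate r = (Y - X) / X when the population goes from X to Y."""
    return (y - x) / x

def exponential_population(p0, r, n):
    """Population after n generations: (1 + r)^n * P_0."""
    return (1 + r) ** n * p0

r_covid = round(growth_rate(27486, 94894), 2)
covid_estimate = exponential_population(94894, r_covid, 5)   # 5 weeks after March 23

r_flu = growth_rate(100_000, 80_000)
flu_estimate = exponential_population(100_000, r_flu, 10)    # 10 years after 2010

print(r_covid, round(covid_estimate))
print(r_flu, round(flu_estimate, 2))
```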

The sum of the first \(n\) terms of a geometric sequence with initial term \(a_0\) and common ratio \(R\) (with \(R\ne 1\)) is given by the following formula:

\[a_0\cdot \frac{R^n-1}{R-1}\]

Example 9.

Find the sum of the first 10 terms of the geometric sequence 1, 2, 4, 8, ….

Solution.

We have \(a_0=1\) and \(R=2\), so the sum of the first 10 terms is

\[a_0\cdot \frac{R^n-1}{R-1}=1\cdot \frac{2^{10}-1}{2-1}=1023\]
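The sum formula can be compared with adding the terms one by one. The following short sketch is our own (the function name `geometric_sum` is hypothetical) and assumes \(R\ne 1\), since the formula divides by \(R-1\).

```python
def geometric_sum(a0, R, n):
    """Sum of the first n terms a0, a0*R, ..., a0*R^(n-1), assuming R != 1."""
    return a0 * (R**n - 1) / (R - 1)

by_formula = geometric_sum(1, 2, 10)
by_addition = sum(1 * 2**k for k in range(10))   # 1 + 2 + 4 + ... + 512
print(by_formula, by_addition)
```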

The Logistic Growth Model

We consider the modeling of the number of individuals that a habitat can carry. Due to the restriction of resources, we assume this number of individuals is at most \(C\), called the carrying capacity.

In the \(n\)th generation with population \(P_n\),

  • the difference, \(C-P_n\), is called the “elbow room” for that population, the amount available for the population to grow.

  • the ratio, \(p_n=\frac{P_n}{C}\), is called the “p-value” of that population (just like the occupancy rate at a hotel). The \(1-p_n\) is called the relative “elbow room” for the population.

We say that a population follows a logistic growth model, if the population growth can be modeled by the recursive formula: \[p_{n+1}=r(1-p_n)p_n\]

where \(r\) is called the growth parameter, an inter-generational growth rate. The value \(p_0\) behaves like a seed.

Example 10.

A population grows according to the logistic growth model, with growth parameter \(r=1.8\), starting with an initial population given by \(p_0=0.4\).

  1. Find the values of \(p_1\) and \(p_{10}\).

  2. What does the logistic growth model predict in the long term for this population?

Solution.

  1. By the recursive formula \(p_{n+1}=r(1-p_n)p_n\),

\[p_1=r(1-p_0)p_0=1.8(1-0.4)0.4=0.432\]

\[p_2=r(1-p_1)p_1=1.8(1-0.432)0.432=0.4417\]

\[p_3=r(1-p_2)p_2=1.8(1-0.4417)0.4417=0.4439\]

By the same token, \(p_4=0.4443, p_5=0.4444, p_6=0.4444, p_7=0.4444, p_8=0.4444, p_9=0.4444, p_{10}=0.4444\).

  2. In the long run, the p-value for the population approaches 0.4444, or \(\frac{4}{9}\).

Example 11.

Fish farming is a big business, so you decide to give it a try. You have access to a large, natural pond in which you plan to set up a rainbow trout hatchery. The carrying capacity of the pond is \(C=10,000\) fish, and the growth parameter of this type of rainbow trout is \(r=2.5\).

You start by seeding the pond with an initial population of 2000 rainbow trout. Use the logistic model for the growth of the fish population in your pond.

  1. Find the values of \(p_1\) through \(p_5\).

  2. What does the logistic growth model predict in the long term for this fish population?

Solution.

We know that \(r = 2.5, P_0 = 2000\) and \(C = 10,000\), so \(p_0 = \frac{P_0}{C}=\frac{2000}{10,000}=0.2\)

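  1. The p-values are:

\[p_1=r(1-p_0)p_0=2.5(1-0.2)0.2=0.4\]

\[p_2=r(1-p_1)p_1=2.5(1-0.4)0.4=0.6\]

\[p_3=r(1-p_2)p_2=2.5(1-0.6)0.6=0.6\]

\[p_4=p_5=0.6\]

  2. The p-value stays at 0.6 from the second generation on, so in the long term the model predicts a stable population of \(0.6\cdot 10,000=6000\) fish.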
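Because each p-value is computed from the previous one, the logistic model is convenient to run on a computer. The following Python sketch is our own illustration (the function name `logistic_p_values` is hypothetical); it repeats the recursive formula \(p_{n+1}=r(1-p_n)p_n\) for Examples 10 and 11 and rounds the results to four decimal places for display.

```python
def logistic_p_values(r, p0, generations):
    """Return [p_0, p_1, ..., p_generations] for p_(n+1) = r (1 - p_n) p_n."""
    values = [p0]
    for _ in range(generations):
        p = values[-1]
        values.append(r * (1 - p) * p)
    return values

example_10 = logistic_p_values(1.8, 0.4, 10)
example_11 = logistic_p_values(2.5, 0.2, 5)

print([round(p, 4) for p in example_10])
print([round(p, 4) for p in example_11])
print(round(example_11[-1] * 10_000))   # long-term number of fish for C = 10,000
```

Trying other values of `r` and `p0` in this function is an easy way to explore how the long-term behavior depends on the growth parameter.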

Chapter 10. Financial Mathematics

Here are some facts from national surveys:

This chapter will provide you with a few tools needed for a basic understanding of financial mathematics. We start from some basic mathematics that are essential.

10.1 Percentages

Cut a pizza into three equal pieces. The fraction \(\frac{2}{3}\) represents the shares when 2 pieces are taken from the whole pizza. Cut a copy of the first pizza into 4 equal pieces and you take 3 of them, what you have is \(\frac{3}{4}\) of the second pizza. Is \(\frac{3}{4}\) bigger than \(\frac{2}{3}\)? It is not immediately clear, since the two fractions have different denominators which correspond to different cuts of basically the same pizza. Now, if you further cut each piece of the first pizza into four smaller pieces and cut each piece of the second pizza into three smaller pieces, you will find \(\frac{3}{4}\) is bigger than \(\frac{2}{3}\), since both pizzas are actually cut into 12 equal pieces, and \(\frac{3}{4}\) represents nine smaller pieces and \(\frac{2}{3}\) represents eight smaller pieces.

A comparison of fractions with different denominators is inconvenient, so this is a reason why human beings use the common denominator 100 in real life. A fraction with numerator \(x\) and denominator 100 can be written as \(x\%\), which is called a percentage and is read “x percent”. For example, the percentages \(5\%\) and \(125\%\) are the same as \(\frac{5}{100}\) and \(\frac{125}{100}\), respectively.

Convert Numbers to Percentages

An important property of fractions is that the fraction remains the same when both numerator and denominator are multiplied by a non-zero number. Symbolically,

\[\frac{a}{b}=\frac{a\cdot c}{b\cdot c}\]

If you are going from the right side to the left side, you are simplifying (or reducing) the fraction on the right side.

Example 1.

You got 18 out of 20 in a Statistics class and you got 44 out of 50 in a Mathematics class. This means you got \(\frac{18}{20}=\frac{90}{100}\) or \(90\%\) of all possible points in Statistics and you got \(\frac{44}{50}=\frac{88}{100}\) or \(88\%\) of all possible points in Mathematics. Therefore, you did better in Statistics, relatively.

Example 2.

Write the following fractions as percentages.

  1. \(\frac{12}{100}\)

  2. \(\frac{5}{25}\)

  3. \(\frac{120}{200}\)

Solution.

  1. \(12\%\)

  2. \(\frac{5}{25}=\frac{20}{100}\), or \(20\%\).

  3. \(\frac{120}{200}=\frac{60}{100}\), or \(60\%\).

For a number with a decimal point, it is numerically equal to the fraction with the numerator being the number without the decimal point and the denominator being 10, 100, 1000, and so on, depending on how many digits are following the decimal point. For example,

  • 0.3 is the same as \(\frac{3}{10}\), since there is one digit following the decimal point.

  • 0.28 is the same as \(\frac{28}{100}\), since there are two digits following the decimal point.

  • 0.437 is the same as \(\frac{437}{1000}\), since there are three digits following the decimal point.

Example 3.

Which of the following are numerically the same?

\[\frac{3}{5}, ~0.75, ~0.6, ~\frac{6}{8}, ~60\%\] Solution.

\[60\% =\frac{60}{100}= \frac{6}{10}=\frac{3}{5} = 0.6\] \[0.75=\frac{75}{100}=\frac{3}{4}=\frac{6}{8}\]

Convert Percentages to Decimals

Many times, we need to convert a percentage to a decimal. To do so, first write the percentage as a fraction with the denominator being 100 and the numerator being a number with a decimal point (if the numerator is an integer, just add the decimal point at the end). The result is the numerator with the decimal point moved two places to the left.

For example, the percentage 23% is the same as 0.23, 42.5% is the same as 0.425, and 0.203% is the same as 0.00203.

If you have a percentage such as \(\frac{1}{4}\%\), you first need to convert the numerator to the decimal 0.25, and then move the decimal point left by two places to get the result 0.0025.

If you have a percentage such as \(3\frac{2}{5}\%\), you first need to convert the numerator to the decimal 3.4, and then move the decimal point left by two places to get the result 0.034.

Example 4.

Express each of the following percentages as a decimal.

  1. 5.36%

  2. \(12\frac{3}{4}\%\)

  3. 0.45%

  4. 0.004%

Solution.

  1. 0.0536

  2. \(12\frac{3}{4}\%\) is the same as \(12.75\%\), so the answer is 0.1275.

  3. 0.0045

  4. 0.00004.
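The "move the decimal point two places to the left" rule is the same as dividing by 100. The sketch below, with the hypothetical helper names `percent_to_decimal` and `mixed_number`, uses Python's `decimal` module so that the results are exact decimals rather than binary floating-point approximations.

```python
from decimal import Decimal


def mixed_number(whole: int, num: int, den: int) -> Decimal:
    """Convert a mixed number such as 12 3/4 to the decimal 12.75."""
    return Decimal(whole) + Decimal(num) / Decimal(den)


def percent_to_decimal(percent) -> Decimal:
    """Divide a percentage (given as a string or Decimal) by 100."""
    return Decimal(percent) / 100


if __name__ == "__main__":
    # The four percentages from Example 4.
    print(percent_to_decimal("5.36"))
    print(percent_to_decimal(mixed_number(12, 3, 4)))
    print(percent_to_decimal("0.45"))
    print(percent_to_decimal("0.004"))
```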

To find \(x\%\) of \(b\), we do \(x\%\cdot b\). For example, Jerry agrees to give Tom 20% of his money. If Jerry has $30, how much does he give Tom? The solution: 20% of 30 equals \(20\%\cdot 30\), which is the same as \(0.20\cdot 30\), or $6.

Percentage Increase and Percentage Decrease

When a number \(a\) is increased to \(b\), the percentage increase is defined to be \(\frac{b-a}{a}\cdot 100\%\). For example, if Tom’s salary is increased from $50,000 to $54,000, then the percentage increase is \[\frac{54,000-50,000}{50,000}\cdot 100\%\]

or,

\[0.08\cdot 100\%\] or 8%.

When a number \(a\) is decreased to \(b\), the percentage decrease is defined to be \(\frac{a-b}{a}\cdot 100\%\). For example, if the price of a jacket is decreased from $50 to $40, then the percentage decrease is \[\frac{50-40}{50}\cdot 100\%\]

or,

\[0.20\cdot 100\%\] or 20%.

When increasing 20 by 10%, we add the amount of increase, which is 20 times 10%, to the original amount, which is 20. That is, we do

\[20 + 20\cdot 10\%\] Factor the common factor 20 out to get

\[20(1+ 10\%)\] which is equivalent to

\[20(1 + 0.10)\] or, \[20(1.10)\] or 22.

The example suggests the following formula:

When \(x\) is increased by \(y\%\), it becomes \(x(1+y\%)\).

Similarly, when decreasing 20 by 10%, we subtract the amount of decrease, which is 20 times 10%, from the original amount, which is 20. That is, we do

\[20 - 20\cdot 10\%\] Factor the common factor 20 out to get

\[20(1- 10\%)\] which is equivalent to

\[20(1 - 0.10)\] or, \[20(0.90)\] or 18.

We have another formula:

When \(x\) is decreased by \(y\%\), it becomes \(x(1-y\%)\).

Example 5.

A 500-piece puzzle is missing 12% of its pieces from its box. How many pieces are in the box?

Solution.

We need to find what it is when 500 is decreased by 12%. The answer is \(500(1-12\%)\), which is \(500(1-0.12)\), or \(500(0.88)\), or 440.

Example 6.

At a mall, you buy a jacket at the price of $125. If the tax rate is 6.80%, how much do you pay for the bill?

Solution.

You need to find what it is when 125 is increased by 6.80%. The answer is \(125(1+6.80\%)\), which is \(125(1+0.068)\), or \(125(1.068)\), or $133.50.

Example 7.

At a mall, you buy a T-shirt that is marked $80. After the sales tax, the bill is $85.28. What is the tax rate (to the nearest tenth of a percent)?

Solution.

Let the tax rate be \(r\%\). Then, we have the equation: \[80(1+r\%)=85.28\]

Dividing both sides by 80 yields

\[1+r\%=1.066\] Subtracting 1 from both sides yields \[r\%=0.066\] which is the same as

\[\frac{r}{100}=0.066\] Multiplying both sides by 100 yields \(r=6.60\). So, the answer is 6.6% (rounded to the nearest tenth of a percent).

Example 8.

For three consecutive years the tuition at a private college increased by 8%, 12%, and 6%, respectively. If the original tuition was $32,000, what was the tuition after three years?

Solution.

After the first increase, the tuition became \(32,000(1+8\%)\) in dollars, or $34,560. After the second increase, the tuition became \(34,560(1+12\%)\) in dollars, or $38,707.20. After the third increase, the tuition became \(38,707.20(1+6\%)\) in dollars, or $41,029.63 (rounded to the nearest hundredth).

The calculation is equivalent to

\[32,000(1+8\%)(1+12\%)(1+6\%)\] which equals $41,029.63 (rounded to the nearest hundredth).
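The formulas \(x(1+y\%)\) and \(x(1-y\%)\) chain naturally when several changes are applied one after another, as in Example 8. Here is a minimal Python sketch; the function name `apply_changes` and the convention that a negative percentage means a decrease are assumptions made for illustration.

```python
def apply_changes(amount: float, percent_changes: list[float]) -> float:
    """Apply successive percentage changes to an amount.

    A positive entry is an increase and a negative entry is a decrease,
    so each step multiplies the running amount by (1 + y/100).
    """
    for pct in percent_changes:
        amount *= 1 + pct / 100
    return amount


if __name__ == "__main__":
    # Example 8: tuition of $32,000 raised by 8%, 12%, and 6%.
    print(round(apply_changes(32_000, [8, 12, 6]), 2))
    # Example 5: a 500-piece puzzle missing 12% of its pieces.
    print(round(apply_changes(500, [-12]), 2))
```

Because each step is a multiplication, the order of the changes does not affect the final amount.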

Example 9.

For three consecutive years the tuition at a private college increased by 8%, 12%, and 6%, respectively. What was the overall percentage increase of tuition during the three-year period?

Solution.

Let’s assume that the original tuition was $P. Then, after the three increases, the tuition became \(P(1+8\%)(1+12\%)(1+6\%)\) in dollars. Let’s denote it by \(P_1\).

The overall percentage increase is

\[\frac{P_1 - P}{P}\cdot 100\%\]

or,

\[\frac{P(1+8\%)(1+12\%)(1+6\%) - P}{P}\cdot 100\%\] or,

\[\frac{P(1.08)(1.12)(1.06) - P}{P}\cdot 100\%\] or,

\[\frac{P(1.282176) - P}{P}\cdot 100\%\]

Factoring \(P\) from the numerator yields

\[\frac{P[1.282176 - 1]}{P}\cdot 100\%\] canceling \(P\) and simplifying the part in the brackets yields

\[0.282176\cdot 100\%\] or 28.22% (rounded to the nearest hundredth of a percent).

A reflection: since \(P\) cancels, the answer does not depend on \(P\); that is, you can choose any \(P>0\), as long as it is convenient for calculation.

As a final note to this section, we point out the following:

  • “Three times more than” is the same as “Four times as many (much) as”

  • Jumping from 5% to 6% is NOT a 1% increase, but a 20% increase, since
    \[\frac{6\%-5\%}{5\%}\cdot 100\% = 20\%\] You can say that 6% is one percentage point higher than 5%.

10.2 Simple Interest

In finance, interest is the price someone (the borrower) pays for the use of the money of someone else (the lender). It can be thought of as a rental cost applied not to a car but to money.

A loan is money, property, or other material goods a lender gives to a borrower in exchange for future repayment of the loan value amount with interest.

The key variables involved in the computation of the interest on a loan are:

  • Principal (P). This is the amount of money the lender lends the borrower.

  • Interest rate (r). This is the percentage that the lender charges the borrower for the use of the money for a specific unit of time, usually a year. When the unit of time is a year, the interest rate is called an annual percentage rate (APR).

  • Term (t). It is the life of the loan, typically in years or months.

  • Repayment Schedule. This is a schedule agreed upon by lender and borrower for the repayment of the loan. In single payment loans the borrower repays the loan in a single lump sum payment at the end of the term. In installment loans the borrower repays the loan by making equal monthly payments over the term of the loan. In credit card loans, the repayment schedule is up to a point at the discretion of the borrower.

  • Simple Interest. This applies only to the principal. If the principal is \(P\), the interest rate (APR) is \(r\) (a percentage, written as a decimal for calculation), and the term is \(t\) in years, then the interest is calculated by the formula: \(I = Prt\). The typical repayment schedule for a simple interest loan is a single payment at the end of the loan.

  • Compound Interest. When a loan is based on compound interest, interest is calculated one or multiple times in each year, and each next interest is calculated based on the principal and the previous interest. The typical repayment schedule for a compound interest loan requires the borrower to make installment payments (say monthly).

Example 10.

Suppose you borrow $875 for a term of four years at simple interest and 4.28% APR. How much is the total (principal plus interest) you must pay back on the loan?

Solution.

You are given: \(P=\$875\), \(t=4\) years, and \(r=4.28\%\).

The interest \(I = Prt=875(0.0428)(4)=\$149.80\).

So, you pay back the principal plus the interest, which is \(875+149.80\) or $1,024.80.
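The simple interest calculation above can be sketched in Python as follows. The function names are illustrative, and the rate is passed as a decimal (0.0428 for 4.28%).

```python
def simple_interest(principal: float, apr: float, years: float) -> float:
    """Interest I = P * r * t, with the APR given as a decimal."""
    return principal * apr * years


def total_repayment(principal: float, apr: float, years: float) -> float:
    """Principal plus simple interest, i.e. P(1 + r t)."""
    return principal + simple_interest(principal, apr, years)


if __name__ == "__main__":
    # Example 10: $875 borrowed for four years at 4.28% APR.
    print(round(simple_interest(875, 0.0428, 4), 2))
    print(round(total_repayment(875, 0.0428, 4), 2))
```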

Simple interest is also used on certain types of investment instruments, such as government and corporate bonds. Government agencies, including the U.S. government, states, cities, and municipalities, issue bonds as a way to raise money. Private corporations do the same. When you buy a bond, you are essentially lending your money to the agency issuing the bond. The variables associated with a typical bond are the term of the bond (a few years to 30 years), the APR, and the face value (or par value, denoted by \(F\)) of the bond. The face value of a bond is the amount you get back from the issuing agency when the bond reaches maturity (i.e., at the end of the term).

While corporate bonds all have some level of default risk (no matter how small), U.S. Treasury bonds are used as a benchmark by the market because they are regarded as having essentially no default risk. Therefore, corporate bonds generally pay a higher interest rate than Treasury bonds.

Example 11.

Suppose you purchase a four-year bond with an APR of 5.75%. The face value of the bond is $4920. Find the purchase price of the bond.

Solution.

You are given: \(F = 4920\), \(t=4\) years, and \(r=5.75\%\). You want to find \(P\), the purchase price of the bond.

Here, the face value of the bond is the single lump sum you get after investing \(P\) dollars for 4 years. It is equal to \(P\) plus the interest. That is, \(F = P + Prt\). Factoring \(P\) out yields \(F = P(1 + r\cdot t)\). This is the formula that links \(F\) to \(P\), \(r\), and \(t\). Plugging values into this equation yields \(4920 = P(1+0.0575(4))\) or \(4920 = 1.23P\), which gives \(P=\$4000\).

The example shows that knowing \(F\), \(r\), and \(t\), the price of a bond \(P\) can be calculated by the formula: \(P = \frac{F}{1+rt}\).
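As a quick check of the formula \(P = \frac{F}{1+rt}\), here is a small Python sketch. The name `bond_purchase_price` is hypothetical, and the function assumes a simple-interest bond exactly as described above.

```python
def bond_purchase_price(face_value: float, apr: float, years: float) -> float:
    """Purchase price P = F / (1 + r t) of a simple-interest bond."""
    return face_value / (1 + apr * years)


if __name__ == "__main__":
    # Example 11: four-year bond, 5.75% APR, face value $4920.
    print(round(bond_purchase_price(4920, 0.0575, 4), 2))
```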

Example 12.

Suppose you purchase an eight-year bond for $5400. The face value of the bond when it matures is $8316. Find the APR.

Solution.

You are given: \(F = 8316\), \(t=8\) years, and \(P=5400\). You want to find \(r\), the APR of the eight-year bond.

Using the formula \(F = P(1 + r\cdot t)\) gives the equation:

\[8316 = 5400(1 + 8r)\] Dividing both sides by 5400 yields \(1.54=1+8r\) or \(0.54=8r\) or \(r=0.0675\). That is, the APR is 6.75%.

Example 13.

Advance America is a payday loan company that offers quick, short-term loans using the borrower’s future paychecks as collateral. Advance America charges $17 for each $100 loaned for a term of 14 days. Find the APR charged by Advance America.

Solution.

We are given \(I = \$17\), \(P=\$100\), and \(t = 14\) days, or \(t=\frac{14}{365}\) years.

Plugging these numbers in the simple interest formula \(I = Prt\) yields

\[17 = 100\cdot r\cdot\frac{14}{365}\] Multiplying both sides by 365 yields \[6205 = 1400r\]

Dividing both sides by 1400 gives \(r = 4.4321\), or \(443.21\%\) (rounded to the nearest hundredth of a percent). Since the APR of a payday loan can be as high as 450%, some states have banned payday loan companies from doing business.
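Solving \(I = Prt\) for \(r\) gives \(r = \frac{I}{Pt}\), which is the step used in Examples 12 and 13. A minimal Python sketch follows; the function name `simple_apr` is illustrative, and the term is converted from days to years by dividing by 365 as in the text.

```python
def simple_apr(interest: float, principal: float, years: float) -> float:
    """Solve I = P r t for r; the result is the APR as a decimal."""
    return interest / (principal * years)


if __name__ == "__main__":
    # Example 13: $17 charged per $100 for a 14-day loan.
    payday_apr = simple_apr(17, 100, 14 / 365)
    print(f"{payday_apr * 100:.2f}%")

    # Example 12: an eight-year bond bought for $5400 with face value $8316.
    bond_apr = simple_apr(8316 - 5400, 5400, 8)
    print(f"{bond_apr * 100:.2f}%")
```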

Example 14.

Treasury bills (or T-bills) are short-term securities that mature in one year or less from their issue date. T-bills are purchased for a price less than or equal to their par (face) value, and when they mature, Treasury pays their par value. The interest is the difference between the purchase price of the security and what is paid at maturity (or what it sells for if it is sold before it matures). If an investor bought a $10,000, 26-week Treasury bill for $9,750 and held it until maturity, what is the interest and what is the APR?

Solution.

The interest would be \(10,000 - 9,750=\$250\).

Since \(I = Prt\), where \(I = 250\), \(P=9,750\), and \(t = 26\) weeks or 0.5 year, we have \(250 = 9,750\cdot r\cdot 0.5\). Solving the equation gives \(r \approx 0.0513\); that is, the APR is 5.13% (rounded to the nearest hundredth of a percent).

A comparison of savings accounts, bonds, and CDs:

| Investment Type | Maturity Date | Lender | Borrower | Interest Type | Interest Taxable? | Penalty for Early Withdrawal |
|---|---|---|---|---|---|---|
| Checking account | No | Owner of the account | Financial institution | Usually compounded daily | Yes | No |
| Savings account | No | Owner of the account | Financial institution | Compound | Yes | No |
| Bond | >10 years | Buyer | Company or government | Simple | Yes | May incur a penalty if cashed early |
| CD | A few months to a few years | Buyer | Financial institution | Compound | Yes | Yes |

10.3 Compound Interest

An ordinary savings account has no term and allows you to withdraw money at any time. Because of this liquidity, the interest rates on such accounts are usually low.

A certificate of deposit (CD) is a savings account that holds a fixed amount of money for a fixed period of time (a few months to a few years), and in exchange, the issuing bank or credit union pays interest at a fixed APR. When you cash in or redeem your CD, you receive the money you originally invested plus any interest. Unlike a bond, the interest on a CD is usually compounded annually, but monthly or semi-annual compounding is also possible.

Advantages and disadvantages of an investment on a CD:

  • Advantages: safe, higher rate of return than ordinary savings accounts

  • Disadvantages: fixed term, early withdrawal penalty

The formula for calculating the future value of an amount invested in an account (such as a CD) for certain years with compound interest is given below:

\[F = P(1+\frac{r}{n})^{nt}\]

where

  • \(P\) = Present Value (Initial investment)

  • \(r\) = rate of interest, expressed as a decimal

  • \(n\) = number of times the interest is compounded in a year

  • \(t\) = term in years

  • \(\frac{r}{n}\) is called the periodic interest rate.

Note: When the interest is compounded annually, \(n=1\); semi-annually, \(n=2\); quarterly, \(n=4\); monthly, \(n=12\); daily, \(n=365\). When the interest is compounded continuously, \(n\) approaches \(\infty\) and the formula becomes \(F=Pe^{rt}\), where \(e \approx 2.71828\).
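The compound interest formula and its continuous limit can be sketched in Python as below. The function names are illustrative, and the $1,000 deposit at 5% for 10 years is a made-up case chosen only to show how the compounding frequency changes the result.

```python
import math


def future_value(principal: float, apr: float,
                 periods_per_year: int, years: float) -> float:
    """F = P (1 + r/n)^(n t) for interest compounded n times per year."""
    periodic_rate = apr / periods_per_year
    return principal * (1 + periodic_rate) ** (periods_per_year * years)


def future_value_continuous(principal: float, apr: float, years: float) -> float:
    """F = P e^(r t) for continuously compounded interest."""
    return principal * math.exp(apr * years)


if __name__ == "__main__":
    # Hypothetical deposit: $1,000 at 5% APR for 10 years.
    for label, n in [("annually", 1), ("quarterly", 4),
                     ("monthly", 12), ("daily", 365)]:
        print(f"{label:>10}: {future_value(1000, 0.05, n, 10):.2f}")
    print(f"continuous: {future_value_continuous(1000, 0.05, 10):.2f}")
```

The printed values increase with the compounding frequency and approach the continuous result, which is the sense in which \(Pe^{rt}\) is the limiting case.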

Example 15.

Find the future value of an investment of $3,250 compounded annually with a 9% APR for a term of (a) four years. (b) four and a half years.

Solution.

  (a) We are given: \(P=\$3,250\), \(r = 9\% = 0.09\), \(n=1\), and \(t = 4\) years. So, the future value is

\[F = P(1+\frac{r}{n})^{nt}=3,250(1+\frac{0.09}{1})^{1\cdot 4}=3,250 \cdot 1.09^4=3,250\cdot 1.41158=\$4,587.64\]

  (b) We are given: \(P=\$3,250\), \(r = 9\% = 0.09\), \(n=1\), and \(t = 4.5\) years. So, the future value is

\[F = P(1+\frac{r}{n})^{nt}=3,250(1+\frac{0.09}{1})^{1\cdot 4.5}=3,250 \cdot 1.09^{4.5}=3,250\cdot 1.473734=\$4,789.64\]

Example 16.

Between 1990 and 2010 the average annual inflation rate was 3.5%. Find the salary in 2010 dollars that would be equivalent to a salary of $25,000 in 1990.

Solution.

We are given: \(P=\$25,000\), \(r = 3.5\% = 0.035\), \(n=1\), and \(t = 20\) years. So, the future value is

\[F = P(1+\frac{r}{n})^{nt}=25,000(1+\frac{0.035}{1})^{1\cdot 20}=25,000 \cdot 1.035^{20}=25,000\cdot 1.989789=\$49,744.72\]

Example 17.

Consider a CD paying a 3% APR compounded monthly. (a) Find the periodic interest rate. (b) Find the future value of the CD if you invest $1,580 for a term of three years. What is the total interest?

Solution.

We are given: \(P=\$1,580\), \(r = 3\% = 0.03\), \(n=12\), and \(t = 3\) years.

(a) The periodic interest rate is \(\frac{r}{n}=\frac{0.03}{12}=0.0025\), or 0.25% per month.

(b) The future value is

\[F = P(1+\frac{r}{n})^{nt}=1,580(1+\frac{0.03}{12})^{12\cdot 3}=1,580 \cdot 1.0025^{36}=1,580\cdot 1.094051=\$1,728.60\]

The total interest is \(1,728.60 -1,580=\$148.60\).

Example 18.

Consider a CD paying a 3.6% APR compounded continuously. Find the future value of the CD if you invest $1,580 for a term of three years. What is the total interest?

Solution.

We are given: \(P=\$1,580\), \(r = 3.6\% = 0.036\), \(n=\infty\), and \(t = 3\) years. So, the future value is

\[F = Pe^{rt}=1,580\cdot 2.71828^{0.036\cdot 3}=1,580\cdot 2.71828^{0.108}=1,580\cdot 1.11405=\$1,760.20\]

The total interest is \(1,760.20 -1,580=\$180.20\).

Example 19.

Consider a CD paying a 3.65% APR compounded continuously. Find the future value of the CD if you invest $1,580 for a term of 500 days. Round your answer to the nearest dollar.

Solution.

We are given: \(P=\$1,580\), \(r = 3.65\% = 0.0365\), \(n=\infty\), and \(t = \frac{500}{365}\approx 1.369863\) years. So, the future value is

\[F = Pe^{rt}=1,580\cdot 2.71828^{0.0365\cdot 1.369863}=1,580\cdot 2.71828^{0.05}=1,580\cdot 1.05127=\$1,661\]

rounded to the nearest dollar.

It is difficult to compare two APRs when the frequency of compounding differs. The annual percentage yield (APY), also known as the effective rate, is the right measure to compare instead of the APR. The APY is calculated by the following formula:

\[APY=(1+\frac{r}{n})^n-1\] where

  • \(n\) = the frequency of compounding

  • \(r\) = rate of interest (i.e., APR), expressed as a decimal.

Note: when the compounding is continuous, the formula becomes \[APY=e^r-1\] where \(e\approx 2.71828\) and \(r\) is the rate of interest (i.e., APR), expressed as a decimal.

Warning: The textbook uses \(T\) instead of \(n\).

Example 20.

Which is a better investment: a 4.5% APR compounded quarterly or 4% APR compounded continuously?

Solution.

We can answer this question by calculating the APY of each investment.

The APY of the first investment:

\[APY=(1+\frac{r}{n})^n-1 = (1+\frac{0.045}{4})^4-1\approx 0.045765=4.5765\%\] The APY of the second investment:

\[APY=e^r-1=2.71828^{0.04}-1\approx 0.040811=4.0811\%\] Since the first investment has the higher APY, it is better.
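The comparison in Example 20 can be reproduced with a short Python sketch of the two APY formulas. The function names `apy` and `apy_continuous` are illustrative choices.

```python
import math


def apy(apr: float, periods_per_year: int) -> float:
    """APY = (1 + r/n)^n - 1 for interest compounded n times per year."""
    return (1 + apr / periods_per_year) ** periods_per_year - 1


def apy_continuous(apr: float) -> float:
    """APY = e^r - 1 for continuously compounded interest."""
    return math.exp(apr) - 1


if __name__ == "__main__":
    # Example 20: 4.5% compounded quarterly versus 4% compounded continuously.
    first = apy(0.045, 4)
    second = apy_continuous(0.04)
    print(f"first:  {first:.6f}")
    print(f"second: {second:.6f}")
    print("first is better" if first > second else "second is better")
```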

10.4 Retirement Savings

A retirement savings account is one in which you make contributions at the end of each year (or month) until you retire (including the retirement year). In fact, most employer-sponsored retirement savings plans (such as individual retirement accounts (IRA’s), 401k’s, and 403b’s) use an automatic monthly contribution system. That is, just tell your employer how much to take out of your paycheck each month, and the money goes directly into a retirement fund of your choosing. It is possible that your employer matches your contribution.

The value of a retirement savings account after a certain number of years can be calculated by the following formula:

\[V = P\frac{(1+r)^Y-1}{r}\] where

  • \(P\) = Payment at the end of each year until retirement (including the retirement year)

  • \(r\) = rate of interest, expressed as a decimal

  • \(Y\) = number of years (or payments)

Note: When payment is made at the end of each month, including the last month, the formula becomes \[V = P\frac{(1+\frac{r}{12})^{12\cdot Y}-1}{\frac{r}{12}}\] where \(\frac{r}{12}\) is the monthly interest rate and \(12\cdot Y\) is the total number of months (or payments).
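Both versions of the retirement savings formula have the same shape: a payment, a rate per period, and a number of payments. The Python sketch below uses that observation; the function name and the small three-payment check are illustrative assumptions, not part of the textbook.

```python
def retirement_value(payment: float, rate_per_period: float,
                     num_payments: int) -> float:
    """V = P ((1 + i)^N - 1) / i for end-of-period payments.

    For annual payments use i = r and N = Y; for monthly payments
    use i = r / 12 and N = 12 * Y.
    """
    growth = (1 + rate_per_period) ** num_payments
    return payment * (growth - 1) / rate_per_period


if __name__ == "__main__":
    # Sanity check with three annual payments of $100 at 10%:
    # the deposits grow to 100(1.1)^2 + 100(1.1) + 100 = 331.
    print(round(retirement_value(100, 0.10, 3), 2))

    # Annual versus monthly contributions over 45 years at 6.6% APR.
    print(round(retirement_value(1500, 0.066, 45), 2))
    print(round(retirement_value(125, 0.066 / 12, 12 * 45), 2))
```

The three-payment case shows where the formula comes from: it is the sum of each deposit grown for the number of periods remaining after it is made.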

Example 21.

Find the value of a retirement savings account paying an APR of 6.6% after 45 years (contributions made at the end of each year, including the last year) when the annual contribution is $1,500.

Solution.

We are given: \(P=\$1,500\), \(r = 6.6\% = 0.066\), and \(Y = 45\) years. So, the value of the retirement savings account after 45 years is

\[V = P\frac{(1+r)^Y-1}{r}=1,500\frac{(1+0.066)^{45}-1}{0.066}=1,500\frac{16.744927}{0.066}=\$380,566.52\]

Example 22.

Find the value of a retirement savings account paying an APR of 6.6% (compounded monthly) after 45 years of monthly contributions (contributions made at the end of each month, including the last month) when the monthly contribution is $125.

Solution.

We are given: \(P=\$125\), \(r = 6.6\% = 0.066\), and \(Y = 45\) years. So, the value of the retirement savings account after 45 years is

\[V = P\frac{(1+\frac{r}{12})^{12\cdot Y}-1}{\frac{r}{12}}= 125\frac{(1+\frac{0.066}{12})^{12\cdot 45}-1}{\frac{0.066}{12}}=125\frac{18.3339}{0.0055}=\$416,679.55\]

10.5 Consumer Debt

U.S. consumers grew their non-housing debt to $4.33 trillion in the fourth quarter of 2021. Credit card debt accounted for around one fifth of it ($856 billion), and much of the rest was some form of installment loan debt such as student loans and auto loans.

We focus on the two most common generators of consumer debt in a modern economy: credit cards and installment loans.

Calculating the Balance of a Credit Card

Example 23. (Book Example 10.24 redone, p.314)

The previous balance $500 will be due in 31 days;

The May 28 bill $85.75 will be due in 28 days;

The June 5 bill $45 will be due in 20 days;

The June 16 bill $73.60 will be due in 9 days.

So, by the formula for total based on the simple interest, the new balance will be

\[P_1(1+r\cdot t_1)+P_2(1+r\cdot t_2)+P_3(1+r\cdot t_3)+P_4(1+r\cdot t_4)\] where \(r\) is the APR, \(P_1, P_2,P_3\) and \(P_4\) are the individual amounts owed, and \(t_1, t_2,t_3\) and \(t_4\) are the numbers of years (days converted to years by dividing by 365) to the next due date.

Plugging the values \(r=0.1724\), \(P_1=500, P_2=85.75,P_3=45\), \(P_4=73.60\), \(t_1=31/365, t_2=28/365,t_3=20/365\) and \(t_4=9/365\) in the above expression gives $713.54, the new balance!

New Balance Calculation

| Items | Amounts | Number of Days to Next Due Date | Number of Years to Next Due Date | r (APR) | Product of Columns 2, 4, 5 (Interest) |
|---|---|---|---|---|---|
| Previous balance | 500.00 | 31 | 0.0849315 | 0.1724 | 7.3210959 |
| Phone bill payment | 85.75 | 28 | 0.0767123 | 0.1724 | 1.1340614 |
| Gasoline purchase | 45.00 | 20 | 0.0547945 | 0.1724 | 0.4250959 |
| Groceries | 73.60 | 9 | 0.0246575 | 0.1724 | 0.3128706 |

Note: The last column gives the interest of each item based on the simple interest formula \(I = Prt\). To find the average daily balance, multiply the second and third columns, then divide the sum by 31. Your answer should be $627.85.

Adding the numbers in the second column gives the sum of individual items, which is $704.35. Adding the numbers in the last column gives the total finance charge, which is $9.19. Adding these two numbers gives the new balance $713.54, same as the textbook answer.
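The table calculation above can be reproduced with a short Python sketch. The data structure and function names are illustrative; the amounts, days, and APR are the ones used in Example 23.

```python
DAYS_IN_YEAR = 365

# (item, amount in dollars, days until the next due date)
ITEMS = [
    ("Previous balance", 500.00, 31),
    ("Phone bill payment", 85.75, 28),
    ("Gasoline purchase", 45.00, 20),
    ("Groceries", 73.60, 9),
]


def finance_charge(items, apr: float) -> float:
    """Sum of the simple interest I = P r t on each item, with t in years."""
    return sum(amount * apr * days / DAYS_IN_YEAR for _, amount, days in items)


def new_balance(items, apr: float) -> float:
    """Total of the individual amounts plus the finance charge."""
    return sum(amount for _, amount, _ in items) + finance_charge(items, apr)


def average_daily_balance(items, days_in_cycle: int) -> float:
    """Amount-times-days summed over the items, divided by the cycle length."""
    return sum(amount * days for _, amount, days in items) / days_in_cycle


if __name__ == "__main__":
    print(round(finance_charge(ITEMS, 0.1724), 2))
    print(round(new_balance(ITEMS, 0.1724), 2))
    print(round(average_daily_balance(ITEMS, 31), 2))
```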

Amortizing an Installment Loan

The process of paying off an installment loan (such as a loan on a car or a home) by making regular installment payments over a certain length of time is called amortization. Installment loans typically have much lower APRs than other types of consumer loans because the item financed by the loan serves as collateral.

Given the amount to be amortized, APR, and the number of monthly payments, the monthly payment can be calculated with the following formula:

\[M=P\cdot\frac{i(1+i)^n}{(1+i)^n -1}\] where

  • \(M\) = monthly payment amount

  • \(P\) = amount of principal

  • \(i\) = monthly interest rate, which equals \(\frac{r}{12}\) with r being the APR

  • \(n\) = total number of payments (textbook uses \(T\))

Example 24.

Suppose you purchase a car and you are going to finance $14,500 for 36 months at an APR of 6% compounded monthly. Find the monthly payments on the loan.

Solution.

You are given: \(P=\$14,500\), \(n = 36\), \(r = 0.06\), and thus the monthly interest rate \(i = \frac{r}{12}=\frac{0.06}{12}=0.005\). You want \(M\), the monthly payments on the loan.

By the amortization formula, the monthly payment should be

\[M=P\cdot\frac{i(1+i)^n}{(1+i)^n -1}=14,500\cdot\frac{0.005(1+0.005)^{36}}{(1+0.005)^{36} -1}=\$441.12\]
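The amortization formula translates directly into code. The Python sketch below uses an illustrative function name and also simulates the loan month by month to confirm that the computed payment brings the balance to zero after the last payment.

```python
def monthly_payment(principal: float, apr: float, num_payments: int) -> float:
    """M = P i (1 + i)^n / ((1 + i)^n - 1), with monthly rate i = APR / 12."""
    i = apr / 12
    growth = (1 + i) ** num_payments
    return principal * i * growth / (growth - 1)


if __name__ == "__main__":
    # Example 24: $14,500 financed for 36 months at 6% APR.
    payment = monthly_payment(14_500, 0.06, 36)
    print(round(payment, 2))

    # Each month the balance earns one month of interest, then a payment is made.
    balance = 14_500.0
    for _ in range(36):
        balance = balance * (1 + 0.06 / 12) - payment
    print(abs(balance) < 1e-6)  # the loan is paid off, up to rounding error
```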

Example 25.

The Simpsons are planning to purchase a new home. To do so, they will need to take out a 30-year home mortgage loan of $160,000 through Middletown Bank. Annual interest rates for 30-year mortgages at Middletown Bank are 5.75% compounded monthly.

  1. Compute the Simpsons’ monthly mortgage payment under this loan.

  2. How much interest will the Simpsons pay over the life of the loan?

Solution.

  1. You are given: \(P=\$160,000\), \(n = 12\cdot 30=360\), \(r = 0.0575\), and thus the monthly interest rate \(i = \frac{r}{12}=\frac{0.0575}{12}\approx 0.00479\). You want \(M\), the monthly payments on the loan.

By the amortization formula (carrying the unrounded value of \(i\) through the calculation), the monthly payment should be

\[M=P\cdot\frac{i(1+i)^n}{(1+i)^n -1}=160,000\cdot\frac{0.00479(1+0.00479)^{360}}{(1+0.00479)^{360} -1}\approx\$933.72\]

  2. The interest to be paid equals the total of all 360 monthly payments minus the amount to be amortized; that is, \(360\cdot M-P=360\cdot 933.72 - 160,000=\$176,139.20\).

Example 26.

Ken just bought a house. He made a $25,000 down payment and financed the balance with a 20-year home mortgage loan with an interest rate of 5.5% compounded monthly. His monthly mortgage payment is $950. What was the selling price of the house?

Solution.

Given: \(n = 12\cdot 20=240\), \(r = 0.055\), \(M=950\), and thus the monthly interest rate \(i = \frac{r}{12}=\frac{0.055}{12}\approx 0.00458\). You want the selling price of the house.

By the amortization formula

\[M=P\cdot\frac{i(1+i)^n}{(1+i)^n -1}\] Solving for \(P\) gives

\[P=M\cdot\frac{(1+i)^n -1}{i(1+i)^n}\] Plugging in the numbers (with the unrounded value of \(i\)) gives

\[P=M\cdot\frac{(1+i)^n -1}{i(1+i)^n}=950\cdot\frac{(1+0.00458)^{240} -1}{0.00458(1+0.00458)^{240}}\approx 138,104\] The selling price is the sum of the down payment and the amount that was amortized, i.e., \(25,000+138,104=\$163,104\).

A mortgage is an installment loan that helps you buy or build a home, condo, townhouse or some other kind of property.

The biggest chunk of your monthly mortgage payment will go toward the loan principal (the amount you borrowed) and interest. The payment also goes toward homeowners insurance and property taxes, as well as mortgage insurance and Homeowners Association (HOA) dues, if applicable. You can use some online calculators to get an estimate of how much you’ll pay each month based on how much you’re borrowing and at what rate, how much you’re putting down and how long you have to pay off the loan.

Suggested Exercises from Textbook

10, 16, 20, 22, 24, 26, 32, 38, 48, 54, 58, 62, 64, 65

Example 27 (Exercise 65).

Elizabeth went on a fabulous vacation in May and racked up a lot of charges on her credit card. When it came time to pay her June credit card bill, she left a balance of $1200. Elizabeth’s credit card billing cycle runs from the nineteenth of each month to the eighteenth of the next month, and her interest rate is 19.5%. She started the billing cycle June 19-July 18 with a previous balance of $1200. In addition, she made three purchases, with the dates and amounts shown in Table 10-11. On July 15 she made an online payment of $500 that was credited to her balance the same day.

  1. Find the average daily balance on the credit card account for the billing cycle June 19 -July 18.

  2. Compute the interest charged for the billing cycle June 19-July 18.

  3. Find the new balance on the account at the end of the June 19 -July 18 billing cycle.

New Balance Calculation

| Items | Amounts | Number of Days to Next Due Date | Number of Years to Next Due Date | r (APR) | Product of Columns 2, 4, 5 (Interest) |
|---|---|---|---|---|---|
| 6/19 | 1200.00 | 30 | 0.0821918 | 0.195 | 19.2328767 |
| 6/21 | 179.58 | 28 | 0.0767123 | 0.195 | 2.6863200 |
| 6/30 | 40.00 | 19 | 0.0520548 | 0.195 | 0.4060274 |
| 7/5 | 98.35 | 14 | 0.0383562 | 0.195 | 0.7356041 |
| 7/15 | -500.00 | 4 | 0.0109589 | 0.195 | -1.0684932 |

Solution.

  1. \((1200\cdot 30 + 179.58\cdot 28 + 40.00\cdot 19 + 98.35\cdot 14 - 500\cdot 4)/30=\$1,372.17\)

  2. \(19.2329 + 2.6863 + 0.4060 + 0.7356 - 1.0685=\$21.99\).

  3. The new balance at the end of the June 19 - July 18 billing cycle is \(1200+179.58+40+98.35-500+21.99=\$1039.92\).

Chapter 14. Census, Surveys, Polls, and Studies

Consider the following questions:

To answer each of the questions, data need to be collected and analyzed. The first three questions can be formulated as estimation problems, while the other two can be formulated as hypothesis-testing problems.

Enumeration

Enumeration is a process of counting the number of indivisible and distinguishable individuals (people, animals, plants, machines, lakes, etc.) of interest. A collection of such individuals is called a population. The number of individuals in a population is called the size (or \(N\)-value) of it.

There are two ways to find the size of a population:

  • Census. Do a complete head count.

  • Survey. Do only a partial head count and use this partial information to estimate the size.

Example 1.

You are interested in the number of students in your math 101 class who have been to the UK.

The population is the set of all students in your math 101 class who have been to the UK. To find the size of this population, you can conduct a census of these students in your math 101 class. To do so, you ask each student in your math 101 class. Each time you find a student who has been to the UK, set them aside and add one to your tally. When you have asked each student in the class, you have completed the census.

In practice, it is not always possible to go through each individual in the population in order to find its size. For example, it is impossible to count the number of walleyes of length 10 inches or longer in Lake George. In such a situation, we would do a survey to take a sample from the population.

We introduce some concepts first.

  • Survey. A strategy to collect data from a population in order to draw inferences about it.

  • Sample. A subset of the population chosen (usually randomly) to be the providers of information in a survey. We use \(n\) to denote the size of a sample.

  • Sampling. The act of selecting a sample.

  • Parameter. A true measurement of some characteristic of a population.

  • Statistic. A numerical estimate of a parameter obtained from a sample.

In general, a parameter is an unknown quantity, and a statistic is an educated guess as to what that unknown quantity might be.

Example 2.

How can we know the number of defective lightbulbs in a batch of 100,000?

Here the general population is the batch of 100,000 lightbulbs. The set of all defective lightbulbs in the batch is a subpopulation. We are trying to count the subpopulation.

It would be unrealistic to use an enumeration method. Instead, we can randomly choose a sample of, say, \(n\) lightbulbs and test them one by one. Let \(k\) denote the number of defective lightbulbs among these \(n\). Let’s use \(P\) to denote the size of the general population. Here \(P=100,000\).

It is reasonable to assume that the sample is representative, meaning that the proportion of defective lightbulbs in the sample (called the sample proportion) is the same as the proportion of defective lightbulbs in the population (the population proportion). Writing \(N\) for the unknown number of defective lightbulbs in the batch, we have the equation:

\[\frac{k}{n}=\frac{N}{P}\] Solving for \(N\) gives

\[N = \frac{k}{n}\cdot P\] If \(n=500\) and \(k=12\), then the above equation gives \(N=\frac{k}{n}\cdot P =\frac{12}{500}\cdot 100,000=2400\).

This is an example of one-sample estimation, since it involves only one sample from a population. The \(N\) is the parameter and \(\frac{k}{n}\cdot P\) is a statistic.
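The one-sample estimate \(N = \frac{k}{n}\cdot P\) is a one-line computation. The Python sketch below uses a hypothetical function name and spells out the representativeness assumption in its docstring.

```python
def estimate_subpopulation(k: int, n: int, population_size: int) -> float:
    """Estimate N = (k / n) * P.

    Assumes the sample is representative, i.e. the sample proportion k/n
    equals the population proportion N/P.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return k / n * population_size


if __name__ == "__main__":
    # Example 2: 12 defective bulbs in a sample of 500 from a batch of 100,000.
    print(round(estimate_subpopulation(12, 500, 100_000)))
```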

Example 2.

Madison County has a population of 34,522 people. The county hospital is interested in estimating the number of people in the county with blood-type A-. To do this they test blood samples from 253 patients. Out of this group, 17 have blood-type A-. Use this sample to estimate the number of people in Madison County with blood-type A-.

Solution.

Among the 253 patients, the proportion of patients who have blood type A- is \(\frac{17}{253}\). This is the sample proportion. The population proportion is \(\frac{N}{34,522}\).

Under the assumption that the sample is representative, the two proportions are more or less the same; that is,

\[\frac{17}{253}=\frac{N}{34,522}\] Solving the equation gives \(N=\frac{17}{253}\cdot34522=2319.66\) or 2320 (rounded to the nearest whole number).

When the general population size \(P\) is unknown, how can we estimate the N-value of a subpopulation?

The following example shows a two-sample estimation method, the capture-recapture method, for dealing with this situation.

Example 3.

How many fish of length 10 inches or longer are there in Lake George?

Solution.

The capture-recapture method works as follows.

  • Step 1 (the capture). Capture a sample of fish and denote the number of fish of length 10 inches or longer by \(n_1\). Tag them all and release them. Now a proportion \(\frac{n_1}{N}\) of all fish of length 10 inches or longer in the lake are tagged.

  • Step 2 (the recapture). After waiting a while (such as one day), capture a second sample. Denote the number of fish of length 10 inches or longer in this second sample by \(n_2\) and the number of those that have tags by \(k\).

Let’s make the working assumption: The percentage of tagged fish of length 10 inches or longer in the second sample is roughly the same as the percentage of tagged fish of length 10 inches or longer in the lake. Then, we have

\[\frac{k}{n_2}=\frac{n_1}{N}\] Solving gives \(N = \frac{n_1\cdot n_2}{k}\).

Further assumptions need to be made for the capture-recapture method to work:

  • The chances of being captured are the same for all fish of length 10 inches or longer in the lake.

  • The chances of being captured are the same for fish of length 10 inches or longer whether they are tagged or not.

  • The general population remains the same between the capture and recapture.

  • The tags do not come off.

Example 4.

You want to estimate how many fish there are in a small pond. Let’s suppose that you first capture 500 fish, tag them, and throw them back into the pond. After a couple of days you go back to the pond and capture 120 fish, of which 30 are tagged. Estimate the number of fish in the pond.

Solution.

  • Step 1 (the capture). \(n_1=500\) fish are tagged and released. Now a proportion \(\frac{n_1}{N}\) of all fish in the pond are tagged.

  • Step 2 (the recapture). After a couple of days, you capture a second sample of size \(n_2 = 120\) and the number of tagged fish is \(k=30\).

Make the working assumption: The percentage of tagged fish in the second sample is roughly the same as the percentage of tagged fish in the pond. Then, we have

\[\frac{k}{n_2}=\frac{n_1}{N}\] Plugging in the numbers gives

\[\frac{30}{120}=\frac{500}{N}\] Solving the equation gives \(N=2000\).
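As a rough sketch, the capture-recapture estimate \(N = \frac{n_1\cdot n_2}{k}\) can be computed as follows. The function name `capture_recapture` is an illustrative choice, and the result is only as good as the assumptions listed earlier (equal chances of capture, a stable population, tags that stay on).

```python
def capture_recapture(n1, n2, k):
    """Solve k/n2 = n1/N for N (the basic capture-recapture estimate)."""
    if k == 0:
        # With no tagged individuals in the second sample the proportion
        # k/n2 is zero and the equation has no solution for N.
        raise ValueError("no tagged individuals were recaptured")
    return n1 * n2 / k


# Pond example: 500 tagged, second sample of 120, 30 of them tagged.
print(capture_recapture(n1=500, n2=120, k=30))
```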

Chapman’s correction of the capture-recapture formula:

Using the same three input variables \(n_1\) (size of the first sample), \(n_2\) (size of the second sample), and \(k\) (number of tagged individuals in the second sample), Chapman’s correction is given by the equation

\[\frac{k+1}{n_2+1}=\frac{n_1+1}{N+1}\] Solving the equation gives \(N=\frac{(n_1+1)(n_2+1)}{k+1}-1\).

Example 4.

A biologist wants to estimate the size of a population of turtles in a lake. She captures 10 turtles on her first visit to the lake, and marks their backs with paint. A week later she returns to the lake and captures 15 turtles. Five of these 15 turtles have paint on their backs, indicating that they are recaptured animals. Estimate \(N\), the total number of turtles in the lake.

Solution.

This example has \((n_1, n_2, k) = (10, 15, 5)\). We want \(N\).

\[N=\frac{(n_1+1)(n_2+1)}{k+1}-1=\frac{(10+1)(15+1)}{5+1}-1\approx 28.33\] The total number of turtles in the lake is estimated to be about 28 with the Chapman method.
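For comparison, here is a small sketch that solves Chapman’s equation \(\frac{k+1}{n_2+1}=\frac{n_1+1}{N+1}\) for \(N\) and sets the result next to the uncorrected estimate \(\frac{n_1\cdot n_2}{k}\). The function names are illustrative only.

```python
def chapman_estimate(n1, n2, k):
    """Solve (k + 1)/(n2 + 1) = (n1 + 1)/(N + 1) for N."""
    return (n1 + 1) * (n2 + 1) / (k + 1) - 1


def plain_estimate(n1, n2, k):
    """Uncorrected capture-recapture estimate, solving k/n2 = n1/N."""
    return n1 * n2 / k


# Turtle example: (n1, n2, k) = (10, 15, 5).
print(round(chapman_estimate(10, 15, 5), 2))
print(plain_estimate(10, 15, 5))
```

Unlike the uncorrected formula, Chapman’s version never divides by zero, so it still gives an answer when no tagged individuals are recaptured.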

Measurement

Enumeration is a special case of a more general problem called measurement. Enumeration measures the size (\(N\)-value) of a population. In practice, rather than counting something, one may need to “measure” a quantity such as the average number of times incoming freshmen at a large university visited the university’s web page.

The typical way to answer a measurement problem is to conduct a survey (also called a poll when the measurements require asking questions and recording answers). A survey has three steps:

  • Sampling, i.e., choosing the sample or samples

  • “Measuring” the individuals in the sample(s)

  • Drawing inferences about the population from the measurements

Two Sources of bias:

Depending on the sampling procedure, some samples are good, meaning that they are representative of the population and large enough to cover the variability in the population. Other samples are bad, meaning that they are biased and consequently will produce biased estimates of the parameters of interest.

When the selection of a sample has a systematic tendency to favor certain groups of individuals within the population over others, we say that the survey suffers from selection bias (usually caused by undercoverage). Polling people online creates a biased sample, because it selects only people who have access to the internet or who like surfing online, while other people have little or no chance of being selected. Sampling like this is called convenience sampling. While the general population (called the target population) consists of all people, people who surf online (called the sampling frame, the group from which a sample is actually selected) constitute only a subpopulation of the general one.

The second source of bias in a survey is nonresponse bias, which occurs when the response rate to a survey is low.

In some situations, a systematic effort is made to force the sample to be representative of a population through the use of quotas. This is called quota sampling, a human selection procedure. The best alternative to human selection is to let chance determine the selection of a sample. Sampling methods that use randomness as part of their design are known as random sampling. Any sample obtained through random sampling is called a random sample or a probability sample.

There are a few random sampling methods:

  • Simple random sampling. This is a random sampling method in which any sample of the same size has equal chance of being selected.

  • Stratified sampling. This is a random sampling method in which the population is first divided into different groups called strata. A small sample is then chosen by simple random sampling from each stratum, and these samples are pooled to form the desired sample.

  • Systematic sampling. This is a random sampling method that can be used only when the individuals in the population have been listed in numerical order. The first member of the sample is selected at random. The other ones are selected at fixed, regular intervals from the first one. For example, if the first member of the sample is the 4th in the list, then the rest might be numbered 11, 18, 25, 32, …
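The three random sampling methods above can be sketched with Python’s standard `random` module. The population of 40 numbered individuals, the two strata, and the function names are all hypothetical and chosen only for illustration. In practice the starting position for systematic sampling would itself be chosen at random; it is fixed here so that the output matches the example in the text.

```python
import random


def simple_random_sample(population, size, rng):
    """Every sample of the given size is equally likely to be chosen."""
    return rng.sample(population, size)


def stratified_sample(strata, size_per_stratum, rng):
    """Take a simple random sample from each stratum, then pool them."""
    pooled = []
    for members in strata.values():
        pooled.extend(rng.sample(members, size_per_stratum))
    return pooled


def systematic_sample(population, start, step):
    """Take the individual at position `start`, then every `step`-th one."""
    return population[start::step]


rng = random.Random(0)               # seeded so the run is repeatable
individuals = list(range(1, 41))     # hypothetical list numbered 1..40
strata = {"urban": individuals[:20], "rural": individuals[20:]}

print(simple_random_sample(individuals, 5, rng))
print(stratified_sample(strata, 3, rng))
# Position 3 (counting from 0) is the 4th individual; the interval is 7.
print(systematic_sample(individuals, start=3, step=7))
```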

There are also nonrandom sampling methods, such as quota sampling, which is a systematic effort to force the sample to be representative of a given population through the use of quotas: the sample should have so many women, so many men, so many blacks, so many whites, so many people living in urban areas, so many people living in rural areas, and so on. The proportions in each category in the sample should be the same as those in the population.

No sampling method is perfect. The positive difference between the parameter and its estimate (the statistic) obtained from a sample is called the sampling error.

A sample based on a sampling method gives an estimate of a parameter. If you were to redo the sampling using the same sampling method, you would obtain a different estimate for the parameter. This natural variability is called the sampling variability.

Example 5.

The city of Cleansburg has 8325 registered voters. There is an election for mayor of Cleansburg, and there are three candidates for the position: Smith, Jones, and Brown. The day before the election a telephone poll of 680 randomly chosen registered voters produced the following results: 306 people surveyed indicated that they would vote for Smith, 272 indicated that they would vote for Jones and 102 indicated that they would vote for Brown.

  1. Describe the population for this survey.
  2. Describe the sample for this survey.
  3. Name the sampling method used for this survey.
  4. Given that in the actual election Smith received 42% of the vote, Jones 43% of the vote, and Brown 15% of the vote, find the sampling errors in the survey expressed as percentages.
  5. Do you think that the sampling error in this example was due primarily to sampling bias or to chance? Explain your answer.

Solution.

  1. The population is all the registered voters in the city of Cleansburg.

  2. The sample is the 680 randomly polled registered voters from the city.

  3. The simple random sampling method was used for this survey.

  4. The estimated proportion (sample proportion) for Smith is \(\frac{306}{680}\) or 0.45. The estimated proportion (sample proportion) for Jones is \(\frac{272}{680}\) or 0.40. The estimated proportion (sample proportion) for Brown is \(\frac{102}{680}\) or 0.15. So, the sampling errors (positive difference between the parameter and its estimate) are 3%, 3%, and 0%, respectively.

  5. The sampling error should be primarily attributed to chance, since the sample was chosen at random, which eliminates selection bias, and there is no non-response, which eliminates nonresponse bias.
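The sample proportions and sampling errors in this example can be checked with a short script. This is just a sketch of the arithmetic; the variable names are illustrative, and the numbers are the poll counts and election percentages given in the example.

```python
# Poll counts (the sample) and actual election results in percent.
poll_counts = {"Smith": 306, "Jones": 272, "Brown": 102}
actual_percent = {"Smith": 42, "Jones": 43, "Brown": 15}

sample_size = sum(poll_counts.values())

for candidate, count in poll_counts.items():
    sample_percent = 100 * count / sample_size
    # Sampling error: positive difference between parameter and statistic.
    error = abs(actual_percent[candidate] - sample_percent)
    print(candidate, round(sample_percent, 1), round(error, 1))
```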

Example 6.

The 1250 students at Eureka High School are having an election for Homecoming King. The candidates are Tomlinson (captain of the football team), Garcia (class president), and Marsalis (member of the marching band). At the football game a week before the election, a pre-election poll was taken of students as they entered the stadium gates. Of the students who attended the game, 203 planned to vote for Tomlinson, 42 planned to vote for Garcia, and 105 planned to vote for Marsalis.

  1. Describe the population for this survey.
  2. Describe the sample for this survey.
  3. Give the proportion of students who are selected in this survey.
  4. Name the sampling method used for this survey.
  5. Compare and contrast the population and the sampling frame for this survey.
  6. Is the sampling error a result of sampling variability or of sampling bias? Explain.
  7. Give the sample proportions estimating the percentage of the vote going to each candidate.
  8. A week after this survey, Garcia was elected Homecoming King with 51% of the vote, Marsalis got 30% of the vote, and Tomlinson came in last with 19% of the vote. Find the sampling errors in the survey expressed as percentages.

Solution.

  1. The population for this survey consists of all 1250 students at Eureka High School.

  2. The sample for this survey consists of all the 350 students who attended the football game the week prior to the election for Homecoming King.

  3. The proportion of students who are selected in this survey is \(\frac{350}{1250}\) or 28%.

  4. The sampling method used for this survey is convenience sampling.

  5. The population for this survey consists of all 1250 students at Eureka High School, whereas the sampling frame consists of those 350 students who attended the football game the week before the election for Homecoming King.

  6. The sampling error is mainly a result of sampling bias. The sampling frame is not representative of the population.

  7. The sample proportions for this survey are \(\frac{203}{350}\) or 58% for Tomlinson, \(\frac{42}{350}\) or 12% for Garcia, and \(\frac{105}{350}\) or 30% for Marsalis.

  8. The sampling errors are \(58\%-19\%=39\%\) for Tomlinson, \(51\%-12\%=39\%\) for Garcia, and \(30\%-30\%=0\%\) for Marsalis, respectively.

Example 6.

The Cleansburg Planning Department is trying to determine what percent of the people in the city want to spend public funds to revitalize the downtown mall. To do so, the department decides to conduct a survey. Five professional interviewers are hired. Each interviewer is asked to pick a street corner of his or her choice within the city limits, and every day between 4:00 P.M. and 6:00 P.M. the interviewers are supposed to ask each passerby if he or she wishes to respond to a survey sponsored by Cleansburg City Hall. If the response is yes, the follow-up question is asked: Are you in favor of spending public funds to revitalize the downtown mall? The interviewers are asked to return to the same street corner as many days as are necessary until each has conducted a total of 100 interviews. The results of the survey are shown in the table below.

Summary of Survey
Interviewer Yes No Nonrespondents
A 35 65 321
B 21 79 208
C 58 42 103
D 78 22 87
E 12 63 594
Note:
Data from textbook.
  1. What is the size of the sample?
  2. Calculate the response rate in this survey. Was this survey subject to nonresponse bias?
  3. Can you explain the big difference in the data from interviewer to interviewer?
  4. One of the interviewers conducted the interviews at a street corner downtown. Which interviewer? Explain.
  5. Do you think the survey was subject to selection bias? Explain.

Solution.

  1. The size of the sample is 475. It’s obtained by adding all the numbers on the second and third columns.

  2. Since the number of nonrespondents is 1313 (adding numbers in the 4th column), the total number of people initially planned to be interviewed is \(475 + 1313\) or 1788. Thus, the response rate is \(\frac{475}{1788}\) or 26.57%. Yes, the survey is subject to nonresponse bias.

  3. Based on the responses, the proportion answering Yes is 35% for interviewer \(A\), 21% for \(B\), 58% for \(C\), 78% for \(D\), and 16% for \(E\). There is a huge difference among these proportions. The choice of street corner could account for a great deal of the difference in the responses collected.

  4. Interviewer \(D\). We expect that people who live or work downtown are much more likely to answer yes than people in other parts of town.

  5. I think the survey was subject to selection bias. There are two main reasons: (1) People out on the street between 4:00 P.M. and 6:00 P.M. are not representative of the population at large. For example, office and white-collar workers are much more likely to be in the sample than home makers and school teachers. (2) The five street corners were chosen by the interviewers, and the passersby are unlikely to represent a cross section of the city.
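The counts in this solution can be reproduced from the table with a few lines of Python. The dictionary below simply copies the table (yes, no, nonrespondents); the variable names are made up for this sketch.

```python
# (yes, no, nonrespondents) for each interviewer, copied from the table.
survey = {
    "A": (35, 65, 321),
    "B": (21, 79, 208),
    "C": (58, 42, 103),
    "D": (78, 22, 87),
    "E": (12, 63, 594),
}

sample_size = sum(yes + no for yes, no, _ in survey.values())
nonrespondents = sum(missing for _, _, missing in survey.values())
response_rate = sample_size / (sample_size + nonrespondents)

# Percentage of Yes answers among those who responded, per interviewer.
yes_percent = {
    name: round(100 * yes / (yes + no))
    for name, (yes, no, _) in survey.items()
}

print(sample_size, nonrespondents, round(100 * response_rate, 2))
print(yes_percent)
```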

Example 7.

Refer to the following story: An orange grower wishes to compute the average yield from his orchard. The orchard contains three varieties of trees: 50% of his trees are of variety A, 25% of variety B and 25% of variety C.

  1. Suppose that the grower samples randomly from 300 trees of variety A, 150 trees of variety B and 150 trees of variety C. What type of sampling is being used?

  2. Suppose that the grower selects for his sample a 10 by 30 rectangular block of 300 trees of variety A, a 10 by 15 rectangular block of 150 trees of variety B, and a 10 by 15 rectangular block of 150 trees of variety C. What type of sampling is being used?

Solution.

  1. Stratified sampling, since a sample is selected from each stratum.

  2. Quota sampling, since the proportions in the sample match those in the population.

The two situations are different. In (1), the 300 A trees randomly selected may not be in the same block, while in (2), the 300 trees are in the same block, and thus nothing is random.

Cause and Effect

Does smoking cause lung cancer? Does exercising reduce weight? Does coffee make you live longer? All of these are cause-and-effect questions.

A typical cause-and-effect statement takes the form \(X\) causes \(Y\). Causation (or causal relationship) is different from association (or correlation). An association between two things occurs when there is a mutual relationship between them, and association does not imply causation.

Causation can only be established through randomized controlled studies. In a randomized controlled study the subjects are randomly divided into groups: the treatment group(s) and the control group(s). We focus on one treatment and one control. The individuals in the treatment group receive the treatment and the individuals in the control group receive the control (a fake treatment, sometimes called a placebo). The randomization ensures that other causes (called the confounding variables) that could explain the same effect are balanced between the two groups, so that when there is a difference in the results between the two groups, it is attributed only to the difference between the treatment and the control.

A double-blind study is one in which neither the researcher nor the subjects know what treatments are being received.
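To illustrate the random division into two groups, here is a minimal sketch that shuffles a list of subjects and splits it in half. The subject labels and the function name `randomize` are hypothetical.

```python
import random


def randomize(subjects, rng):
    """Randomly split subjects into a treatment group and a control group."""
    shuffled = list(subjects)     # copy so the original list is untouched
    rng.shuffle(shuffled)         # chance, not the researcher, decides
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


rng = random.Random(2024)                            # seeded for repeatability
subjects = [f"S{i:03d}" for i in range(1, 501)]      # 500 hypothetical subjects
treatment, control = randomize(subjects, rng)
print(len(treatment), len(control))
```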

An observational study is one in which no treatment is applied to any individual involved. When studying the relationship between two variables, the true relationship might be masked by other variables called confounding variables, since such variables are associated with each of the two variables being considered.

Example 7.

The manufacturer of a new vitamin (vitamin \(X\)) decides to sponsor a study to determine the vitamin’s effectiveness in curing the common cold. Five hundred college students having a cold were recruited from colleges in the San Diego area and were paid to participate as subjects in this study. The subjects were each given two tablets of vitamin \(X\) a day. Based on information provided by the subjects themselves, 457 of the 500 subjects were cured of their colds within 3 days. (The average number of days a cold lasts is 4.87 days.) As a result of this study, the manufacturer launched an advertising campaign based on the claim that “vitamin \(X\) is more than 90% effective in curing the common cold.”

  1. Describe as specifically as you can the target population for the study.
  2. Describe the sampling frame for the study.
  3. Describe the sample used for the study.
  4. Was the study a controlled study? Explain.
  5. List four possible causes other than the effectiveness of vitamin \(X\) itself that could have confounded the results of the study.
  6. List four different problems with the study that indicate poor design.
  7. Make some suggestions for improving the study.

Solution.

  1. The target population consists of anyone who could have a cold and would consider buying vitamin \(X\).

  2. The sampling frame consists of college students in the San Diego area who had a cold at the time.

  3. The sample consisted of the 500 students that took vitamin \(X\).

  4. No, there was no control group.

  5. College students in the San Diego area are not typical of the population at large in several critical aspects (age, health, exposure to inclement weather, etc.); the volunteers were paid to participate; the subjects themselves determined the length of their colds; there was no control group.

  6. (1) Using college students, a biased sample. (2) Using subjects only from the San Diego area. (3) Offering money as an incentive to participate. (4) Allowing self-reporting.

  7. (1) Choose the subjects at random from the population at large. (2) Divide the subjects randomly into a treatment group (getting vitamin \(X\)) and a control group (getting a fake pill). (3) Have trained professionals (nurses) measure the length of each subject’s cold. (4) Neither the investigator nor the subjects should know who is getting vitamin \(X\) and who is getting a placebo.

Chapter 15. Graphs, Charts, and Numbers

A picture is worth 1000 words.

Visit the webpage: https://blog.prepscholar.com/act-scores-by-state-averages-highs-and-lows. How can you display the data in the second table using a graph?

Data are measurements or observations that are collected as a source of information. Data are usually organized in a table of rows and columns, with rows being cases (individuals, records, or observations) and columns being variables (fields, characteristics or attributes).

A variable can be quantitative or qualitative (or categorical).

A quantitative variable is one in which values of the variable can be added. A quantitative variable can be discrete or continuous. A discrete variable is one in which two values of the variable cannot be arbitrarily close to each other. A continuous variable, on the other hand, is one in which two values of the variable can be arbitrarily close to each other. For example, distance, height, weight, and volume are continuous variables, while test score and the number of siblings are discrete variables.

A qualitative (or categorical) variable is one in which different values of the variable represent different categories. Examples are color, political party affiliation, and nationality.

Describing Categorical Data with Bar Graphs or Pie Charts

When categorical data are available for analysis, they are first summarized with a frequency table, a table with categories and frequencies. The following is a frequency table based on the final grades of students in a math 101 class: A, B, C, B, C, A, D, B, C, B, B, A, F, D, C, A, B, C, B, B, D, B, A, C, B, B, C, A, F, B, A, C.

Distribution of Grades for Math 101
Grade Frequency
A 7
B 12
C 8
D 3
F 2
Note:
This is based on fake data.
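A frequency table like this one can be built by tallying the raw list of grades. The sketch below uses `collections.Counter` from the Python standard library on the 32 grades listed above; the variable names are illustrative.

```python
from collections import Counter

# The 32 final grades listed in the text.
grades = ("A, B, C, B, C, A, D, B, C, B, B, A, F, D, C, A, "
          "B, C, B, B, D, B, A, C, B, B, C, A, F, B, A, C").split(", ")

frequency = Counter(grades)   # maps each grade to how often it occurs

for grade in sorted(frequency):
    share = 100 * frequency[grade] / len(grades)
    print(grade, frequency[grade], f"{share:.1f}%")
```

The last column gives relative frequencies, which are what a pie chart would display.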

Now, we are ready to display the frequency table with a bar graph. A bar graph is a set of named bars (vertical or horizontal) with the height of each bar representing the corresponding frequency.

A bar graph can also be used to plot summary statistics on different items.

How Do College Students Spend Their Day?
Activity Time (in hours)
Grooming 0.8
Eat/Drink 1.0
Travel 1.4
Work 2.4
Education 3.3
Leisure/Sports 4.1
Sleep 8.7
Other 2.3
Note:
This is based on book Example 15.4 data.

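Can we make a pie chart directly from these numbers? No, since they are hours rather than proportions that add up to 100%. Because the times add up to 24 hours, each one would first have to be converted to a percentage of the day.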

Describing Quantitative Data with Histograms or Boxplots

Given the final exam score of students in Math 101: 77, 86, 80, 94, 79, 93, 83, 72, 77, 88, 70, 86, 89, 82, 78, 81, 82, 86, 83, 92, 85, 89, 77, 97, 74, 81, 73, 81, 86, 75, 73, 85, 81, 80, 67, 93

We can make score intervals (65, 70], (70, 75], (75, 80], (80, 85], (85, 90], (90, 95], (95, 100] so that they are consecutive, not overlapping, and there are no gaps in between. Then, count the number of scores in each of these intervals. This gives the following frequency table:

Frequency Table of Final Scores of Students in Math 101
Intervals Frequency
(65, 70] 2
(70, 75] 5
(75, 80] 7
(80, 85] 10
(85, 90] 7
(90, 95] 4
(95, 100] 1
Note:
This is based on book Example 15.4 data.

The lower bound of the first interval should be slightly smaller than the smallest value in the data. The upper bound of the last interval should be slightly bigger than the largest value in the data. A rule of thumb for the number of intervals is 5 to 25.
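The counting step can be sketched in Python as follows. The function `interval_counts` is a made-up helper that assumes equal-width intervals of the form (low, low + width], as in the table above.

```python
import math

scores = [77, 86, 80, 94, 79, 93, 83, 72, 77, 88, 70, 86,
          89, 82, 78, 81, 82, 86, 83, 92, 85, 89, 77, 97,
          74, 81, 73, 81, 86, 75, 73, 85, 81, 80, 67, 93]


def interval_counts(values, low, width, bins):
    """Count values in (low, low+width], (low+width, low+2*width], ..."""
    counts = [0] * bins
    for value in values:
        # A value on a right endpoint belongs to the interval it closes.
        index = math.ceil((value - low) / width) - 1
        counts[index] += 1
    return counts


counts = interval_counts(scores, low=65, width=5, bins=7)
for i, count in enumerate(counts):
    print(f"({65 + 5 * i}, {70 + 5 * i}]", count)
```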

Now, we can display the frequency in a graph called the histogram.

A histogram has consecutive bars without gaps in between, while a bar graph consists of bars with gaps in between. Histograms are for quantitative data and bar graphs are for categorical data.

To examine a histogram, pay attention to whether it is symmetric, right-skewed, or left-skewed. A right-skewed histogram has a longer tail extending to the right. A left-skewed histogram has a longer tail extending to the left.

The above three histograms give the distributions of students’ scores in different Math 101 sections taught by different professors. Which class is the easiest? Which one is the hardest?

Describing Time Series Data with Line Graphs

If data values are recorded with time, the data are called time series data. Time series data can be displayed in a line graph. The table below shows the percentage of U.S. workers who are members of unions (1983-2021).

Percentage of U.S. workers who are members of unions (1983-2021)
Year Percent
1983 20.1
1984 18.8
1985 18.0
1986 17.5
1987 17.0
1988 16.8
1989 16.4
1990 16.0
1991 16.0
1992 15.7
1993 15.7
1994 15.5
1995 14.9
1996 14.5
1997 14.1
1998 13.9
1999 13.9
2000 13.4
2001 13.3
2002 13.3
2003 12.9
2004 12.5
2005 12.5
2006 12.0
2007 12.1
2008 12.4
2009 12.3
2010 11.9
2011 11.8
2012 11.3
2013 11.3
2014 11.1
2015 11.1
2016 10.7
2017 10.7
2018 10.5
2019 10.3
2020 10.8
2021 10.3

The data show that union membership has been declining since the 1980s.

Example 1.

The chief financial officer (CFO) of a music download site has just secured the rights to offer downloads of a new album. To see how well it’s selling, she collects the number of downloads per hour for the past 24 hours:

Number of Downloads per Hour over the Past 24 Hours
Hour Downloads
12:00 a.m. 36
1:00 a.m. 28
2:00 a.m. 19
3:00 a.m. 10
4:00 a.m. 5
5:00 a.m. 3
6:00 a.m. 2
7:00 a.m. 6
8:00 a.m. 12
9:00 a.m. 14
10:00 a.m. 20
11:00 a.m. 18
12:00 p.m. 25
1:00 p.m. 22
2:00 p.m. 17
3:00 p.m. 18
4:00 p.m. 20
5:00 p.m. 23
6:00 p.m. 21
7:00 p.m. 18
8:00 p.m. 24
9:00 p.m. 30
10:00 p.m. 27
11:00 p.m. 30
Note:
This is based on book Example 15.4 data.

CFO: I can’t really tell how many downloads are happening each day. Can you find a way to help me better understand the number of downloads?

Analyst: I got the following histogram. It shows that, for most hours of the day, we get between 15 and 30 downloads.

CFO: Are there any interesting patterns in the hourly download data?

Analyst: I created a different visual.

Surprisingly, for this day, downloads were highest at midnight with about 36 downloads per hour, then dropped sharply until about 5-6 AM, when they reached their minimum at 2-3 per hour. They gradually increased to about 20 per hour by noon, and then stayed in the twenties until midnight, with a slight increase during the evening hours. When we ignored the time order, as we did earlier, we missed this pattern entirely.

The above example is taken from Business Statistics by Sharpe.

Means, Medians, and Percentiles

The average of a set of quantitative values is the sum of these values divided by the number of values. The average is also known as the mean. The mean tells roughly how large a typical value is.

Example 2.

Tom’s scores for the first 10 quizzes in Math 101 are 8, 7, 9, 10, 9, 10, 8, 7, 10, and 8. Find the mean score.

Solution.

Adding the 10 values gives 86. Dividing this sum by the number of values gives 8.6, the mean.

Example 3.

There are 50 students in Dr. Robinson’s Stat 101 class. The number of siblings of each student is collected and the numbers are summarized in a frequency table as shown below:

Frequency Table of Number of Siblings of Students in Stat 101
Number of Siblings Frequency
0 5
1 18
2 15
3 7
4 3
5 1
6 1
Note:
This is fake data.

Find the mean number of siblings for students in this class.

Solution.

The data show that there are 5 zeros, 18 ones, 15 twos, 7 threes, 3 fours, 1 five, and 1 six. The sum of the 50 values is \(0\cdot 5+1\cdot 18+2\cdot 15+3\cdot 7+4\cdot 3+5\cdot 1+6\cdot 1=92\). Dividing by 50 gives the mean, 1.84.
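When data come as a frequency table, the mean is the sum of (value × frequency) divided by the total frequency. A short sketch with the sibling data, using illustrative variable names:

```python
# Number of siblings -> number of students, copied from the table.
siblings = {0: 5, 1: 18, 2: 15, 3: 7, 4: 3, 5: 1, 6: 1}

number_of_students = sum(siblings.values())
total_siblings = sum(value * freq for value, freq in siblings.items())
mean_siblings = total_siblings / number_of_students

print(number_of_students, total_siblings, round(mean_siblings, 2))
```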

Sometimes we want a cutoff for our data. Such a cutoff separates some values from the other values. The values that are no greater than the cutoff make up a certain percent of all values in the data. For example, if we use a cutoff of 23 for the data 6, 6, 9, 10, 10, 18, 18, 21, 31, 31, then 80% of the data are no greater than 23. We call 23 the 80th percentile.

The steps for finding the \(p\)-th percentile of \(N\) values are given below:

  • Step 1: Sort the values in ascending order.

  • Step 2: Find the locator \(L = \frac{p}{100}\cdot N\).

  • Step 3: Get the percentile. There are two cases.

      ◦ If \(L\) is an integer, the percentile is the average of the \(L\)-th and the \((L+1)\)-th value in the sorted data.

      ◦ If \(L\) is NOT an integer, round it up to the next integer, called \(K\). The percentile is the \(K\)-th value in the sorted data.

Example 4.

To reward good academic performance from its athletes, ABC University has a program in which any athlete with a GPA in the 80th or higher percentile of their team’s GPAs gets a $5000 scholarship, and any athlete with a GPA in the 55th or higher percentile of their team’s GPAs who did not get the $5000 scholarship gets a $2000 scholarship.

The women’s volleyball team has \(N=15\) players on the roster. A list of their GPA’s is as follows:

3.42, 3.91, 3.33, 3.65, 3.57, 3.45, 4.00, 3.71, 3.35, 3.82, 3.67, 3.88, 3.76, 3.41, 3.62

Find the 80th and 55th percentiles, which are the cutoffs for the two scholarships, respectively.

Solution.

To find the 80th percentile for the $5000 scholarship, we take the following steps.

Step 1: Sort the data from smallest to largest. This gives 3.33, 3.35, 3.41, 3.42, 3.45, 3.57, 3.62, 3.65, 3.67, 3.71, 3.76, 3.82, 3.88, 3.91, 4.00

Step 2: Find the locator L. \(L = \frac{p}{100}\cdot N=\frac{80}{100}\cdot 15=12\). So, the 80th percentile is the average of the 12th and 13th value in the sorted data, or \((3.82+3.88)/2=3.85\).

To find the 55th percentile for the $2000 scholarship, we take the following steps.

Step 1: Sort the data from smallest to largest. This gives 3.33, 3.35, 3.41, 3.42, 3.45, 3.57, 3.62, 3.65, 3.67, 3.71, 3.76, 3.82, 3.88, 3.91, 4.00

Step 2: Find the locator L. \(L = \frac{p}{100}\cdot N=\frac{55}{100}\cdot 15=8.25\). Round it up to 9. So, the 55th percentile is the 9th value in the sorted data, or 3.67.
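The three-step locator procedure used in this example can be written as a short function. This is a sketch of the rule as stated in these notes (statistical software often uses slightly different percentile rules); the function name `percentile` is illustrative, and it assumes \(p\) is a whole number strictly between 0 and 100.

```python
import math


def percentile(values, p):
    """p-th percentile using the locator L = (p/100) * N described above."""
    data = sorted(values)                 # Step 1: sort in ascending order
    n = len(data)
    if (p * n) % 100 == 0:                # Step 2: L is an integer
        L = p * n // 100
        # Average of the L-th and (L+1)-th values (positions count from 1).
        return (data[L - 1] + data[L]) / 2
    K = math.ceil(p * n / 100)            # otherwise round L up to K
    return data[K - 1]                    # Step 3: the K-th value


gpas = [3.42, 3.91, 3.33, 3.65, 3.57, 3.45, 4.00, 3.71,
        3.35, 3.82, 3.67, 3.88, 3.76, 3.41, 3.62]

print(round(percentile(gpas, 80), 2))   # cutoff for the $5000 scholarship
print(percentile(gpas, 55))             # cutoff for the $2000 scholarship
```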

There are a few special percentiles: the 50th percentile (also known as median, denoted by \(M\)), the 25th percentile (or first quartile or lower quartile, denoted by \(Q_1\)), and the 75th percentile (third quartile or upper quartile, denoted by \(Q_3\)).

The median is just the very middle value of the sorted data. If there are two values in the middle of the sorted data, use the average of them to get the median.

For any quantitative data, we can now find its minimum (denoted by \(Min\)), \(Q_1\), \(M\), \(Q_3\), and the maximum (denoted by \(Max\)). The five values are called the Five-number summary of the data. These values can be displayed in a graph called the box plot.

For the GPA data just used, the five-number summary is: 3.33, 3.42, 3.65, 3.82, 4.00.

The box plot is

The box plots are better used for comparisons of multiple groups.

Section B’s scores have a higher median (close to 80) and smaller spread (or variability) than section A (median below 70), so section B performs better than section A.

Both the mean and the median measure the central tendency of data. Since the mean can be affected by extremely large or small values (called outliers) in the data, it is recommended to use the median rather than the mean to describe the center of data that contain outliers or are strongly skewed.

Ranges and Standard Deviations

The spread of data can be described by the range, interquartile range, and standard deviation of the data. We now introduce each of them.

The range (denoted by \(R\)) of data is the difference between the maximum and the minimum.

The interquartile range (denoted by \(IQR\)) of data is the difference between the third and the first quartiles, \(Q_3 - Q_1\).

The five-number summary for the GPA data discussed before is: 3.33, 3.42, 3.65, 3.82, 4.00, so the range of the data is \(4.00 - 3.33\) or 0.67. The interquartile range is \(3.82 - 3.42\) or 0.40.

The (sample) variance of a set of quantitative data is defined as follows:

Given \(n\) numerical values \(x_1, x_2, \cdots, x_n\), the variance is denoted by \(s^2\) and is given by \[s^2 = \frac{(x_1 -\bar{x})^2+(x_2 -\bar{x})^2+\cdots+(x_n -\bar{x})^2}{n-1}\] where \(\bar{x}\) is the mean of the data and \(x_1 -\bar{x}\) is called the deviation of \(x_1\) from the mean.

A note: The textbook uses \(n\), the number of values, as the denominator when calculating the variance of a sample. We will use \(n-1\) instead. When using MyLabMath, the practice quizzes use \(n\) instead of \(n-1\). When you treat the data as a population, the denominator is \(n\). Excel actually distinguishes the two. For any quiz or the final exam, use \(n-1\) when calculating the variance of a sample.

The standard deviation (denoted by \(s\)), which is the square root of the variance, is the most often used measure of spread of the data.

Example 5.

The data below are the numbers of steps Tom walked in 5 consecutive days: 12000, 10280, 9820, 15000, 8605

Find the

  1. mean

  2. standard deviation (rounded to four decimal places)

  3. \(z\)-scores (rounded to two decimal places). The \(z\)-score or standard score of a value is the deviation of this value from the mean, divided by the standard deviation, so a \(z\)-score is just deviation measured in standard deviation unit.

Solution.

  1. 11141

  2. The variance \[s^2=\frac{(12000-11141)^2 + (10280-11141)^2 + (9820-11141)^2 + (15000-11141)^2 + (8605-11141)^2}{5-1}=6136855,\] and the standard deviation \(s = \sqrt{6136855}\approx 2477.2676\).

  3. The \(z\)-score for 12000 is calculated by \(\frac{12000-11141}{2477.2676}\), which is 0.35. The other \(z\)-scores are calculated in the same way and are -0.35, -0.53, 1.56, and -1.02, respectively. The first \(z\)-score (0.35) means that the value 12000 is 0.35 standard deviations above the mean. The second \(z\)-score (-0.35) means that the value 10280 is 0.35 standard deviations below the mean. The other \(z\)-scores are interpreted similarly.
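The three parts of Example 5 can be checked with a short Python sketch. It uses only the standard-library `statistics` module, whose `variance` and `stdev` functions use the \(n-1\) denominator adopted in these notes; the variable names are illustrative.

```python
import statistics

steps = [12000, 10280, 9820, 15000, 8605]

mean = statistics.mean(steps)          # arithmetic mean
variance = statistics.variance(steps)  # sample variance (denominator n - 1)
sd = statistics.stdev(steps)           # square root of the sample variance

# z-score: deviation from the mean, measured in standard deviations
z_scores = [round((x - mean) / sd, 2) for x in steps]

print(mean, variance, round(sd, 4))
print(z_scores)  # [0.35, -0.35, -0.53, 1.56, -1.02]
```

A positive \(z\)-score marks a day above Tom's average and a negative one a day below it.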

Do It Yourself

  1. Use the data from the second graph here: https://usafacts.org/articles/labor-union-membership/ and follow the video: https://www.youtube.com/watch?v=b0H6azi9wUM to write a few sentences summarizing the graph.

  2. Watch the video: https://www.youtube.com/watch?v=IaWATD_FvAc. Use the fertility rates available here: https://www.cdc.gov/nchs/pressroom/sosmap/fertility_rate/fertility_rates.htm to calculate the mean, median, range, and standard deviation using Microsoft Excel.

Chapter 16. Probabilities, Odds, and Expectations

Consider the problems:

In a typical 6/49 game, each player chooses six distinct numbers from 1 to 49. If the six numbers on a ticket match the numbers drawn by the lottery, regardless of the order of the numbers, the ticket holder is a jackpot winner. What is the chance of this happening? What are the odds of this happening?

Let’s say a ticket costs $3, and you have a 0.00000007 chance of winning $10 million. Should you buy one?

We will introduce tools that help us solve these problems and other chance-related problems. We first introduce some basic concepts.

Sample Spaces and Events

A random experiment is an activity or process whose outcome cannot be predicted ahead of time.

A sample space (denoted by \(S\)) is the set of all possible outcomes of a random experiment.

An event is any subset of the sample space. An impossible event is an event that never happens, while a certain (or sure) event is an event that always happens. We use capital letters to denote events, such as \(A\), \(B\), and \(C\).

A simple event is an event that consists of a single outcome.

Example 1.

Flip a coin once.

  1. Explain why this is a random experiment.

  2. What is a sample space you can specify?

  3. How many possible events are there? What does each represent?

  4. How many simple events are there?

Solution.

  1. This is a random experiment, since it is an activity whose outcome cannot be predicted ahead of time.

  2. A possible sample space could be \(S=\{H, T\}\), where “H” represents a head and “T” represents a tail.

  3. Since the sample space, as a set, has four subsets: \(\{~~\}\), \(\{H\}\), \(\{T\}\), \(\{H, T\}\), there are four events. Specifically, \(\{~~\}\) represents the impossible event, \(\{H\}\) represents the event of flipping a head, \(\{T\}\) represents the event of flipping a tail, and \(\{H, T\}\) represents the event of flipping a head or tail.

  4. A simple event, as a set, contains only one possible outcome. So, there are only two simple events: \(\{H\}\) and \(\{T\}\).

Example 2.

Flip a coin twice.

  1. Explain why this is a random experiment.

  2. What is a sample space you can specify?

  3. How many possible events are there? What does each represent?

  4. How many simple events are there?

Solution.

  1. This is a random experiment, since it is an activity whose outcome cannot be predicted ahead of time.

  2. A possible sample space could be \(S=\{HH, HT, TH, TT\}\), where “HH” represents two heads in a row and “HT” represents a head followed by a tail.

  3. Since the sample space, as a set, has 16 subsets: \(\{~~\}\), \(\{HH\}\), \(\{HT\}\), \(\{TH\}\), \(\{TT\}\), \(\{HH, HT\}\), \(\{HH, TH\}\), \(\{HH, TT\}\), \(\{HT, TH\}\), \(\{HT, TT\}\), \(\{TH, TT\}\), \(\{HH, HT, TH\}\), \(\{HH, HT, TT\}\), \(\{HH, TH, TT\}\), \(\{HT, TH, TT\}\), and \(S=\{HH, HT, TH, TT\}\), there are 16 events. For example, \(\{HH, HT\}\) represents the event that the first flip is a head, and \(\{HH, HT, TH\}\) represents the event of at least one head.

  4. A simple event, as a set, contains only one possible outcome. So, there are four simple events: \(\{HH\}\), \(\{HT\}\), \(\{TH\}\), and \(\{TT\}\).
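Listing subsets by hand gets tedious as the sample space grows. The following Python sketch enumerates every event for the two-flip experiment using the standard-library `itertools` module; the variable names are illustrative.

```python
from itertools import combinations, product

# Sample space for two coin flips: HH, HT, TH, TT
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

# Every subset of the sample space is an event.
events = [
    set(subset)
    for size in range(len(sample_space) + 1)
    for subset in combinations(sample_space, size)
]

# Simple events contain exactly one outcome.
simple_events = [event for event in events if len(event) == 1]

print(len(events))         # 16
print(len(simple_events))  # 4
```

A sample space with \(k\) outcomes always has \(2^k\) events, which is why one flip gives 4 events and two flips give 16.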

The Multiplication Rule of Counting, Permutations, and Combinations

If there are \(m\) different ways to do task \(X\) and, for each way \(X\) is done, there are \(n\) different ways task \(Y\) can be done, then \(X\) and \(Y\) together can be done in \(m\cdot n\) different ways. This is called the multiplicative rule of counting. The rule can be extended to multiple tasks.

To seat 2 people (say Tom and Jerry) in seats 1, 2, 3, or 4, there are 4 ways to seat Tom. Once Tom is seated, there are 3 ways to seat Jerry. By the multiplicative rule of counting, there are \(4\cdot 3\) or 12 ways to seat Tom and Jerry. With 3 people, the answer would be \(4\cdot 3\cdot 2\) or 24 ways.

In general, the number of ways to seat \(n\) people in \(n\) seats is \(n\cdot (n-1)\cdot(n-2)\cdots 3\cdot 2\cdot 1\), denoted by \(n!\). For example, \(1!=1\), \(2!=2\cdot 1=2\), \(3!=3\cdot 2\cdot 1=6\), and \(5!=5\cdot 4\cdot3\cdot 2\cdot 1=120\). By convention, \(0!\) is defined to be 1. \(0!\) represents the number of ways to seat 0 people in 0 seats, and there is exactly one way to do that: seat nobody. Another way to understand why \(0!=1\) is the following:

\[\frac{4!}{3!}=\frac{4\cdot3\cdot2\cdot1}{3\cdot2\cdot1}=4\]

\[\frac{3!}{2!}=\frac{3\cdot2\cdot1}{2\cdot1}=3\]

\[\frac{2!}{1!}=\frac{2\cdot1}{1}=2\]

If you want the pattern to continue, \(\frac{1!}{0!}\) has to be 1. That is,

\[\frac{1!}{0!}=\frac{1}{0!}=1.\]

So, \(0!\) must be 1.

There are two important results:

  • The number of ways to choose \(r\) people from \(n\) people to sit on \(r\) seats (order does matter) equals \(\frac{n!}{(n-r)!}\), denoted by \(_nP_r\).

  • The number of ways to choose \(r\) people from \(n\) people (order does not matter) equals \(\frac{n!}{r!(n-r)!}\), denoted by \(_nC_r\) or \(\binom{n}{r}\). This is also known as the number of combinations of \(r\) different things chosen from a set of \(n\) different ones.

Example 1.

  1. Calculate \(\frac{6!}{4!}\).

  2. Choose 4 from 6 people to sit on four different chairs. How many ways are there?

  3. Choose 3 students from a student club of 6 students and send them to Washington DC to drink coffee with Vice President Kamala Harris. How many ways are possible?

Solution.

  1. \(\frac{6!}{4!}=\frac{6\cdot5\cdot4\cdot3\cdot2\cdot1}{4\cdot3\cdot2\cdot1}=30\) (by canceling common factors).

  2. The answer is \(_6P_4 = \frac{6!}{(6-4)!}=\frac{6!}{2!}=360\). You can also do it by the multiplicative rule of counting: \(6\cdot5\cdot4\cdot3=360\).

  3. The answer is \(\binom{6}{3}=\frac{6!}{3!(6-3)!}=\frac{6\cdot5\cdot4\cdot3\cdot2\cdot1}{(3\cdot2\cdot1)(3\cdot2\cdot1)}=20\) (by canceling common factors).
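These counts can be double-checked in Python. The sketch below assumes Python 3.8 or later, where the standard-library `math` module provides `perm` for \(_nP_r\) and `comb` for \(_nC_r\).

```python
import math

print(math.factorial(6) // math.factorial(4))  # 30
print(math.perm(6, 4))                          # 360, order matters
print(math.comb(6, 3))                          # 20, order does not matter
print(math.factorial(0))                        # 1
```

Note that `math.perm(6, 4)` is 24 times `math.comb(6, 4)`, since each group of 4 chosen people can be seated in \(4!\) different orders.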

Probabilities and Odds

Randomly choose a number from the set \(\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}\). There are 10 possible outcomes, so the sample space is \(S=\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}\). Each of the ten outcomes is equally likely, so each should have a chance of 1 out of 10. What is the chance of getting 3 or 7? It seems that the answer should be 2 out of 10. Yes, it is right! What about the chance of the event \(\{3, 6, 9\}\)? This event represents “getting a number divisible by 3.” The chance, as you would guess, is 3 out of 10.

In general, when we have a sample space which consists of \(n\) outcomes that are all equally likely, then the chance that a \(k\)-element event happens ought to be \(\frac{k}{n}\). What is the chance of the sure event \(S\)? It’s 1 or 100%, since it is a set containing all possible outcomes and \(k=n\).

Formally, chance is called probability. When a sample space contains \(n\) equally likely outcomes, we assign a probability of \(\frac{1}{n}\) to each outcome. This reflects the equal chances and makes the total probability equal to 1 or 100%.

If \(A\) is an event and it has a probability of 0.3 to happen, then we will write \(P(A)=0.3\), read “The probability of event \(A\) is 0.3.” For any event \(A\), its probability is always between 0 and 1.

If \(A\) is an event, then “\(A\) does not happen” is also an event and it is called the complementary event of \(A\), denoted \(\text{not}~A\). As expected, \(P(\text{not}~A) = 1-P(A)\).

Two events \(A\) and \(B\) are said to be independent, if the probability that both \(A\) and \(B\) happen equals the product of the probability of each; that is, if \(P(A ~\text{and} ~B) = P(A)\cdot P(B)\).

When several events are independent, the probability that they happen simultaneously equals the product of each individual probability.

Example 2.

  1. Let \(A\) denote the event that it is raining tomorrow. If \(P(A)=0.4\), what is the probability that it is not raining tomorrow?

  2. What is the probability of getting a head if you flip a coin only once?

  3. Flip a coin 5 times. What is the probability of HHHHH? What is the probability of THHTT?

Solution.

  1. The probability that it is not raining tomorrow equals \(1-0.4=0.6\).

  2. The probability of getting a head is 0.5.

  3. Since the 5 flips are independent, the probability of HHHHH equals the product of the probabilities of the 5 H’s. Since each H has a probability of 0.5, the probability of HHHHH equals \((0.5)^5\) or 0.03125 or 3.125%. The probability of THHTT also equals \((0.5)^5\) or 0.03125, since each T has a probability of 0.5 as well.
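The answer to part 3 can also be reached by counting equally likely outcomes instead of multiplying. The sketch below assumes a fair coin and lists all sequences of 5 flips; the helper `probability` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# All sequences of 5 flips; with a fair coin they are equally likely.
sequences = ["".join(flips) for flips in product("HT", repeat=5)]

def probability(event):
    """Equally likely outcomes: (size of event) / (size of sample space)."""
    return Fraction(len(event), len(sequences))

p_hhhhh = probability({"HHHHH"})
p_thhtt = probability({"THHTT"})

print(len(sequences))           # 32
print(p_hhhhh, float(p_hhhhh))  # 1/32 0.03125
print(p_thhtt == p_hhhhh)       # True
```

Both approaches agree because \(2^5=32\) sequences, each with probability \(\frac{1}{32}\), is the same as \((0.5)^5\).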

Another measure of chance is to use odds. The odds of an event \(A\) are defined to be the ratio of \(P(A)\) to \(P(\text{not}~A)\); that is \(Odds(A)=\frac{P(A)}{P(\text{not}~A)}\). Odds are usually written as a ratio.

If the odds of event \(A\) are a:b, then the probability of event \(A\) is given by \(\frac{a}{a+b}\).

Example 3.

  1. What are the odds of getting a head when a coin is flipped only once?

  2. What are the odds of getting two heads in a row if a coin is flipped twice?

  3. If the odds of event \(A\) are 3:2, what is the probability of \(A\)?

Solution.

  1. Since the probability of getting a head is 0.5, the odds of getting a head are \(\frac{0.5}{1-0.5}=1\), that is, 1:1.

  2. The probability of getting two heads equals \(0.5^2\) or 0.25, so the odds of two heads are \(\frac{0.25}{1-0.25}=\frac{0.25}{0.75}=1:3\).

  3. \(P(A)=\frac{3}{2+3}=\frac{3}{5}=0.6\) or 60%.
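Converting between probabilities and odds is mechanical, so it can be sketched in a few lines of Python. The helper names `odds_from_probability` and `probability_from_odds` are hypothetical, introduced here for illustration; exact fractions are used to avoid rounding. The last lines apply the same idea to the 6/49 lottery question from the start of the chapter, assuming every combination of six numbers is equally likely.

```python
from fractions import Fraction
import math

def odds_from_probability(p):
    """Return the odds of an event as a pair (a, b), read 'a:b'.

    Assumes 0 <= p < 1; a sure event (p = 1) has no finite odds.
    """
    ratio = Fraction(p) / (1 - Fraction(p))
    return ratio.numerator, ratio.denominator

def probability_from_odds(a, b):
    """Odds of a:b correspond to probability a / (a + b)."""
    return Fraction(a, a + b)

print(odds_from_probability(Fraction(1, 2)))  # (1, 1)
print(odds_from_probability(Fraction(1, 4)))  # (1, 3)
print(probability_from_odds(3, 2))            # 3/5

# 6/49 lottery: one winning combination out of C(49, 6)
tickets = math.comb(49, 6)
p_jackpot = Fraction(1, tickets)
print(tickets)                                 # 13983816
print(odds_from_probability(p_jackpot))        # (1, 13983815)
```

The jackpot probability is about 0.00000007, which is the figure quoted in the ticket question at the start of the chapter.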

Random Variables and Expectations

A variable \(X\) is called a random variable if \(X\) takes values with uncertainty. If the values of a random variable are listed with their corresponding probabilities, then we have the so-called probability distribution for the random variable.

In arithmetic, the average of a few numbers is the sum of these numbers divided by the number of the values added. For example, the average of 1, 2, 3, and 4 is

\[\frac{1+2+3+4}{4}=2.5\] We can write

\[\frac{1+2+3+4}{4}=\frac{1}{4}+\frac{2}{4}+\frac{3}{4}+\frac{4}{4}=1\cdot\frac{1}{4}+2\cdot \frac{1}{4}+3\cdot \frac{1}{4}+4\cdot\frac{1}{4}\] and we call \[1\cdot\frac{1}{4}+2\cdot \frac{1}{4}+3\cdot \frac{1}{4}+4\cdot\frac{1}{4}\]

the weighted average of the numbers 1, 2, 3, and 4, with weights \(\frac{1}{4}\), \(\frac{1}{4}\),\(\frac{1}{4}\), and \(\frac{1}{4}\). That is, the 4 numbers are equally weighted.

We can use different weights. For example, \[1\cdot (0.3)+2\cdot (0.2)+3\cdot (0.25)+4\cdot(0.25)\] is another weighted average of the same 4 numbers 1, 2, 3, and 4, but with weights 0.3, 0.2, 0.25, and 0.25, respectively.

The weights in a weighted average must be nonnegative and add up to 1. The following is not a weighted average

\[12\cdot (0.2)+20\cdot (0.5)+30\cdot (0.1)+6\cdot(0.1) + 50\cdot (0.05)\] since the 5 weights add up to \(0.2+0.5+0.1+0.1+0.05=0.95\), which is not 1.

The expectation (or mean) of a random variable is the expected value of the random variable. It is the weighted average of its possible values, with the weights being the corresponding probabilities. That is, the expectation is obtained by first multiplying each possible value by its probability and then adding all the products.

Example 4.

Roll a 6-sided die only once. Let \(X\) denote the number of points turning up. \(X\) is a random variable, and its probability distribution is

A Probability Distribution
X \(P(X=x)\)
1 \(\frac{1}{6}\)
2 \(\frac{1}{6}\)
3 \(\frac{1}{6}\)
4 \(\frac{1}{6}\)
5 \(\frac{1}{6}\)
6 \(\frac{1}{6}\)

The expectation of \(X\) equals \[(1)(\frac{1}{6})+(2)(\frac{1}{6})+(3)(\frac{1}{6})+(4)(\frac{1}{6})+(5)(\frac{1}{6})+(6)(\frac{1}{6})=\frac{1}{6}+\frac{2}{6}+\frac{3}{6}+\frac{4}{6}+\frac{5}{6}+\frac{6}{6}=\frac{21}{6}=3.5\]

Example 5.

If the probability distribution of a random variable \(Y\) is

Another Probability Distribution
Y \(P(Y=y)\)
\(-15\) \(\frac{1}{8}\)
0 \(\frac{2}{8}\)
2 \(\frac{2}{8}\)
4 \(\frac{3}{8}\)

Find the expectation of \(Y\).

The expectation of \(Y\) equals \[(-15)(\frac{1}{8})+(0)(\frac{2}{8})+(2)(\frac{2}{8})+(4)(\frac{3}{8})=\frac{-15}{8}+0+\frac{4}{8}+\frac{12}{8}=\frac{1}{8}=0.125.\]

Example 6.

A lab has 2 computers. About 20% of the time, both computers are idle; about 35% of the time, exactly one computer is being used. Find the expectation of the number of computers that are idle at any time.

Solution.

Let \(X\) denote the number of computers that are idle at any time. Then the distribution of \(X\) can be modeled by

A Probability Model for a Computer Lab
X \(P(X=x)\)
0 0.45
1 0.35
2 0.20
Note:
How did I get 0.45? Since there are only 3 possibilities (0, 1, or 2 computers are idle), the 3 probabilities must add up to 1, so \(P(X=0)=1-0.35-0.20=0.45\).

The expectation is:

\[(0)(0.45)+(1)(0.35)+(2)(0.20)= 0+0.35+0.4=0.75\]
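The recipe "multiply each value by its probability, then add" is the same in Examples 4, 5, and 6, so it can be captured in one small function. This is a minimal sketch; the function name `expectation` and the dictionary layout are choices made here for illustration. It also checks that the probabilities add up to 1, as weights in a weighted average must.

```python
from fractions import Fraction

def expectation(distribution):
    """Weighted average of the values, with probabilities as weights.

    `distribution` maps each possible value to its probability.
    """
    total = sum(distribution.values())
    if abs(total - 1) > 1e-9:
        raise ValueError(f"probabilities sum to {total}, not 1")
    return sum(value * prob for value, prob in distribution.items())

# Example 4: one roll of a 6-sided die
die = {face: Fraction(1, 6) for face in range(1, 7)}
# Example 5: the random variable Y
y = {-15: Fraction(1, 8), 0: Fraction(2, 8), 2: Fraction(2, 8), 4: Fraction(3, 8)}
# Example 6: number of idle computers
idle = {0: 0.45, 1: 0.35, 2: 0.20}

print(expectation(die))   # 7/2
print(expectation(y))     # 1/8
print(expectation(idle))  # approximately 0.75
```

Passing weights that do not sum to 1, such as the five weights 0.2, 0.5, 0.1, 0.1, and 0.05 from earlier in this section, raises a `ValueError` instead of returning a misleading number.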

Measuring Risk

Example 6.

Suppose that you roll a single die. If an odd number (1,3,5) comes up, you win the amount of your roll ($1, $3, or $5 respectively). If an even number 2, 4, or 6 comes up, you have to pay the house the amount of your roll ($2, $4, or $6 respectively). (a) Find the expected payoff for this game. (b) Is this a fair game? Explain.

Solution.

Let \(X\) be the amount you win. Then \(X\) is a random variable. Its probability distribution is

A Probability Distribution for a Game
X \(P(X=x)\)
1 \(\frac{1}{6}\)
\(-2\) \(\frac{1}{6}\)
3 \(\frac{1}{6}\)
\(-4\) \(\frac{1}{6}\)
5 \(\frac{1}{6}\)
\(-6\) \(\frac{1}{6}\)
Note:
Here, -4 means a loss of $4.
  1. The expected payoff is just the expectation of \(X\), which equals \[(1)(\frac{1}{6})+(-2)(\frac{1}{6})+(3)(\frac{1}{6})+(-4)(\frac{1}{6})+(5)(\frac{1}{6})+(-6)(\frac{1}{6})=\frac{1}{6}-\frac{2}{6}+\frac{3}{6}-\frac{4}{6}+\frac{5}{6}-\frac{6}{6}=-\frac{3}{6}=-0.5\]

  2. This is not a fair game to play. The expectation of \(-0.5\) means that you are expected to lose 50 cents for each play of the game. For it to be a fair game, the house should pay you 50 cents for each play of the game.
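The expected payoff of the die game, and of the lottery ticket from the start of the chapter, can be computed the same way. The sketch below makes one assumption that the text does not spell out: the $10 million is the gross prize, and the $3 ticket price is paid whether or not the ticket wins.

```python
from fractions import Fraction

# Die game: win the roll if it is odd, pay the roll if it is even.
payoffs = [face if face % 2 == 1 else -face for face in range(1, 7)]
die_game = sum(Fraction(payoff, 6) for payoff in payoffs)
print(die_game)  # -1/2

# Lottery ticket (assumption: gross prize, ticket price never refunded).
p_win = Fraction(7, 100_000_000)  # 0.00000007
prize, ticket = 10_000_000, 3
lottery = p_win * (prize - ticket) + (1 - p_win) * (-ticket)
print(float(lottery))  # -2.3
```

Under that assumption, each $3 ticket loses about $2.30 on average, so the lottery is far from a fair game.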

Example 7.

The grading of Prof. Bestie’s Math 103 is based on homework, two midterms, two projects, and a final exam. The highest possible points are 100 for homework, 50 for each midterm, 20 for each project, and 200 for the final exam. The weights are 20% for homework, 15% for each midterm, 10% for each project, and 30% for the final exam. A student needs to average 90% or above to get an A grade. Tom is a student in this class. His scores are 88 for homework, 42 for midterm one, 45 for midterm two, 15 for project one, 18 for project two, and 184 for the final exam. Will he get an A?

Solution.

Tom gets 88% of the possible score on homework, 42 out of 50 or 84% of the possible score on midterm one, 45 out of 50 or 90% of the possible score on midterm two, 15 out of 20 or 75% of the possible score on project one, 18 out of 20 or 90% of the possible score on project two, and 184 out of 200 or 92% of the possible score on final exam. The two 90%’s are not going to contribute the same to Tom’s final grade, since midterm two and project two are weighted differently (that is, not equally important). The weighted average of these percentages is

\[(88\%)(20\%)+(84\%)(15\%)+(90\%)(15\%)+(75\%)(10\%)+(90\%)(10\%)+(92\%)(30\%)=87.80\%\]

Tom will not get an A, since 87.80% is below 90%, but he is close.
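A grade calculation like this is easy to get wrong by hand, so here is a minimal Python sketch of the same weighted average. The dictionary layout and variable names are illustrative choices, not part of the course's grading software.

```python
# (score, maximum points, weight) for each graded component
components = {
    "homework":    (88, 100, 0.20),
    "midterm one": (42, 50, 0.15),
    "midterm two": (45, 50, 0.15),
    "project one": (15, 20, 0.10),
    "project two": (18, 20, 0.10),
    "final exam":  (184, 200, 0.30),
}

# Weights in a weighted average must add up to 1.
weights_total = sum(weight for _, _, weight in components.values())
assert abs(weights_total - 1) < 1e-9, "weights must add up to 1"

final = sum(score / maximum * weight
            for score, maximum, weight in components.values())

print(round(100 * final, 2))  # 87.8
print("A" if final >= 0.90 else "not an A")
```

Changing a single score in the dictionary shows how much each component matters; the final exam, with weight 30%, moves the result the most.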

Suggested Exercises from Textbook

17, 49, 51, 57, 59

Chapter 17. The Mathematics of Normality

A probability distribution describes a random phenomenon by telling what outcomes are possible and what their probabilities are.

If we do not know the probability distribution for a random phenomenon, we can reasonably assume one based on what we know. Such an assumed distribution is called a probability model, which describes a random phenomenon by telling what outcomes are possible and how to assign probabilities to them. A probability model may not be correct; it serves only as an approximation to the truth.

In Chapter 16, we introduced probability models for random variables that take finitely many possible values. Such random variables are called discrete random variables. In that case, a probability is assigned to each individual outcome of the random variable.

The second kind of probability model assigns probabilities as areas under a so-called density curve. This kind of model is only for a random variable that takes values spread on the number line or a part of it. To find the probability that such a random variable is below (above) a given number, just find the area under the density curve and below (above) this given number. To find the probability that such a random variable is between two given numbers, just find the area under the density curve and between these two given numbers.

The Normal Distribution

For a curve to be a density curve, it must lie on or above the x-axis and the total area under the curve must be 1. If the density curve is a bell-shaped, symmetric curve, the density is called a normal density. If a random variable can be modeled with a normal density, we say the random variable has a normal distribution. The center of the normal curve corresponds to the mean of the normal distribution. Another feature of the normal distribution is its standard deviation, which measures the spread of the distribution.

The following is a tool that allows us to calculate the area of a given region under a normal density curve.

NormalCalculator

To use this calculator, visit https://onlinestatbook.com/2/calculators/normal_dist.html

The above picture shows how you can find the area above 120 under the curve of the normal distribution with mean 100 and standard deviation 15, which is the model used for IQ in this chapter. The shaded area is what is wanted, and it is 0.0912. It means that with probability 0.0912, a randomly selected person has an IQ above 120. It also means that 9.12% of all people have IQs above 120.

Open the link and try to change the mean and standard deviation to see how the normal density curve changes. Now, fill in blanks:

  • When I only change the mean, the _____ (center or spread) of the curve remains the same but the _____ (center or spread) changes.

  • When I only change the standard deviation, the _____ (center or spread) of the curve remains the same but the _____ (center or spread) changes.

  • I really _____ (like or dislike) normal curves.

The answers are: spread, center, center, spread, answer varies

This calculator: https://onlinestatbook.com/2/calculators/inverse_normal_dist.html allows you to go backwards. That is, knowing the left or right area under a normal density curve, you can find the corresponding cutoff on the number line. For example, with mean 100 and standard deviation 15, setting “Area” to 0.35, choosing “Below”, and then clicking “Recalculate” gives 94.22, which is the cutoff on the number line. It means 35% of people have an IQ below 94.22. Similarly, setting “Area” to 0.45, choosing “Above”, and then clicking “Recalculate” gives 101.88, which is the cutoff on the number line. It means 45% of people have an IQ above 101.88.

Example 1.

The IQ of human beings has a normal distribution with mean 100 and standard deviation 15.

  1. What is the probability that a randomly selected person has IQ greater than 130?

  2. What percent of people have IQ above 130? (This question is essentially the same as the previous one.)

  3. What percent of people have IQ between 110 and 125? (Your answer should be 0.2047 or 20.47%)

  4. What percent of people have IQ between 90 and 105?

  5. What percent of people have IQ below 95?

We can use this calculator: https://onlinestatbook.com/2/calculators/inverse_normal_dist.html to do the following:

  1. What is the 25th percentile (also called the first quartile and denoted by \(Q_1\))? That is, 25% of people have IQ below what?

  2. What is the 75th percentile (also called the third quartile and denoted by \(Q_3\))? That is, 75% of people have IQ below what? Or, 25% of people have IQ above what? (Keep in mind: the total area under the normal curve equals 1.)

  3. What is the inter-quartile range (IQR, which is \(Q_3 -Q_1\))?

  4. What is the 95th percentile? That is, 95% of people have IQ below what? or 5% of people have IQ above what?

The answers are: 0.0228, 2.28%, 20.47%, 37.81%, 36.94%, 89.88, 110.12, 20.24, 124.67
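If the online calculators are unavailable, the same areas and cutoffs can be computed in Python. This is a sketch assuming Python 3.8 or later, where the standard library provides `statistics.NormalDist`; its `cdf` method gives the area below a value and `inv_cdf` goes backwards from an area to a cutoff.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# Areas under the curve (forward direction)
p_above_130 = 1 - iq.cdf(130)
p_110_to_125 = iq.cdf(125) - iq.cdf(110)
p_90_to_105 = iq.cdf(105) - iq.cdf(90)
p_below_95 = iq.cdf(95)

# Cutoffs for a given area below (backward direction)
q1 = iq.inv_cdf(0.25)
q3 = iq.inv_cdf(0.75)
p95 = iq.inv_cdf(0.95)

print(round(p_above_130, 4), round(p_110_to_125, 4),
      round(p_90_to_105, 4), round(p_below_95, 4))
print(round(q1, 2), round(q3, 2), round(p95, 2))
```

An area "above" a value is 1 minus the area below it, and an area "between" two values is the difference of the two areas below. The interquartile range is `q3 - q1`; subtracting the rounded quartiles can differ from it by 0.01.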

Example 2.

The heights of adult men are normally distributed with mean 69.2 inches and standard deviation 2.66 inches.

  1. What is the probability that a randomly selected adult man is taller than 72 inches?

  2. What percent of people are taller than 75 inches?

  3. What percent of people are shorter than 65 inches?

  4. What percent of people are between 68 inches and 73 inches tall?

  5. What is the 25th percentile?

  6. What is the third quartile?

  7. What is the 95th percentile?

The answers are: 0.1463, 1.46%, 5.72%, 59.75%, 67.41 inches, 70.99 inches, 73.58 inches

The 68-95-99.7 Rule

If a random variable \(X\) has a normal distribution, then, no matter what the mean and standard deviation are,

  • with probability about 68%, the random variable falls within one standard deviation above or below the mean;

  • with probability about 95%, the random variable falls within two standard deviations above or below the mean;

  • with probability about 99.7%, the random variable falls within three standard deviations above or below the mean.

The rule is depicted in the following graph:

Source: https://towardsdatascience.com/understanding-the-68-95-99-7-rule-for-a-normal-distribution-b7b7cbf760c2

The web page here https://www.chegg.com/homework-help/definitions/normal-curve-31 shows a more detailed graph than the above one. Based on the graph on this web page, we infer that

  • The probability that the random variable falls between the point that is two standard deviations below the mean and the point that is one standard deviation above the mean is approximately 0.8185 (13.59% + 34.13% + 34.13% = 81.85% or 0.8185).
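The percentages in the rule can be computed directly. Because the rule is stated in terms of standard deviations from the mean, it is enough to check it on the standard normal distribution (mean 0, standard deviation 1). The sketch assumes Python 3.8 or later for `statistics.NormalDist`; the helper `within` is a hypothetical name introduced here.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(low, high):
    """Probability of falling between `low` and `high` standard deviations."""
    return z.cdf(high) - z.cdf(low)

for k in (1, 2, 3):
    print(k, round(within(-k, k), 4))  # 0.6827, 0.9545, 0.9973

# Two standard deviations below the mean to one above
print(round(within(-2, 1), 4))
```

The rule's 68, 95, and 99.7 are rounded versions of these values, so sums of the rounded slices from a graph can differ slightly in the last digit from the directly computed probability.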

Here is a nice summary about the normal distribution: https://blog.masterofproject.com/normal-distribution-curve/ and its use in six-sigma.

Quartiles and the Inter-Quartile Range of a Normal Distribution

In a normal distribution with mean \(\mu\) and standard deviation \(\sigma\), approximately,

  • First quartile: \(Q_1\approx \mu-0.6745\sigma\). That is, with 25% chance normal values are less than \(Q_1\).

  • Third quartile: \(Q_3\approx\mu+0.6745\sigma\). That is, with 75% chance normal values are less than \(Q_3\).

  • Inter-quartile range: \(IQR\approx1.35\sigma\). That is, for a normal distribution, the distance between the two quartiles is approximately \(1.35\sigma\).
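The third bullet follows from the first two by subtraction, as this short derivation shows:

\[IQR = Q_3 - Q_1 \approx (\mu+0.6745\sigma)-(\mu-0.6745\sigma) = 2(0.6745)\sigma = 1.349\sigma\approx 1.35\sigma\]

For instance, for the IQ distribution with \(\mu=100\) and \(\sigma=15\), the formulas give \(Q_1\approx 100-0.6745\cdot 15=89.88\) and \(Q_3\approx 100+0.6745\cdot 15=110.12\) after rounding to two decimal places.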

Standardized Values

Each value in a normal distribution with mean \(\mu\) and standard deviation \(\sigma\) can be standardized to become a standardized value often referred to as a \(z\)-value or \(z\)-score. The \(z\)-value of a raw value \(x\) is defined as

\[z=\frac{x-\mu}{\sigma}\]

Such standardization removes the original unit; that is, \(z\)-values are unitless. This makes comparison between values from different normal distributions easier.

Example 3.

Tom took both an SAT test and an ACT test, scoring 1270 on the SAT and 26 on the ACT. The distribution of SAT scores is normal with mean 1060 and standard deviation 217. The distribution of ACT scores is also normal with mean 18 and standard deviation 6.

Which score is relatively better?

Solution.

The z-score of Tom’s SAT is \(z=\frac{1270-1060}{217}=0.9677\).

The z-score of Tom’s ACT is \(z=\frac{26-18}{6}=1.33\).

Since the \(z\)-score of his ACT is higher, Tom did relatively better on the ACT.
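The comparison can be written as a tiny Python sketch. The function name `z_score` is an illustrative choice; it implements the formula \(z=\frac{x-\mu}{\sigma}\) given above.

```python
def z_score(x, mu, sigma):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

z_sat = z_score(1270, 1060, 217)
z_act = z_score(26, 18, 6)

print(round(z_sat, 4), round(z_act, 2))  # 0.9677 1.33

# The larger z-score is the relatively better result.
better = "ACT" if z_act > z_sat else "SAT"
print(better)  # ACT
```

Because \(z\)-scores are unitless, the two tests can be compared even though their raw scales are very different.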

References

The book exercises can be accessed here: https://www.numerade.com/books/excursions-in-modern-mathematics-9th/