Question 1. Which movie in 2024 had the most days as the number one grossing film?
Question 2. How many days in 2024 was this movie at number one.
Pivot Tables - Descriptive Statistics
Worksheet 1 contains U.S. domestic flight data for November and December
How many PASSENGERS flew from San Francisco to Chicago, IL on UNITED?
There is no need to parse text in Worksheet 1 so start by converting these data to an Excel Table (Insert > Table):
Descriptive Statistics Cont’d
Click any cell within table, so that ‘Table Design’ tab is visible.
Click Summarize with PivotTable.
Put the Pivot Table on the Existing Worksheet in location F1.
Specify the Pivot Table fields as shown.
Lecture 5 In-class Exercise - Q3
Question 3. How many CARRIERS (airlines) are included in the flights data set on Worksheet 1
Descriptive Statistics Cont’d
Insert slicers for
ORIGIN: San Francisco, CA
DESTINATION: Chicago, Il
OPTIONAL: Click the comma in the Number group and round to whole numbers to make the data more readable.
Lecture 5 In-class Exercise - Q4-Q6
Question 4. How many CARRIERS flew from San Francisco to Chicago?
Question 5. How many passengers flew on United from San Francisco to Chicago.
Round answer to closest thousand (K).
Question 6. What percentage of all passengers flying this route flew on United (in Worksheet 1).
HINT: Right click on the Sum of Passengers column and select the Show Values As option, % of Column Total.
Round answer to closest whole number.
Parsing and Summarizing Data
Worksheet 2 also contains flight passenger data.
We want to know how many passengers on Southwest Airlines flew from anywhere in Illinois (IL) to anywhere in Pennsylvania (PA).
In order to answer this question, we have to first parse the ORIGIN / DESTINATION in to four separate columns:
ORIGIN CITY
ORIGIN STATE
DESTINATION CITY
DESTINATION STATE
Parsing with two Delimiters
Select the ORIGIN > DESTINATION column.
Click Text to Columns on the Data tab.
Specify Delimited, default, in the pop-up and click next.
Update the column headers to be the following.
ORIGIN CITY, ORIGIN STATE
DESTINATION CITY, DESTINATION STATE
Summarizing Parsed Data
Convert Worksheet 2 to an Excel Table (Insert > Table)
Click Summarize with PivotTable.
Put the Pivot Table on the Existing Worksheet in location H1.
Specify the fields in the Pivot Table as shown.
Summarizing Parsed Data Cont’d
Click on any cell in the Pivot Table and click Insert Slicers
Select ORIGIN STATE and DESTINATION STATE and click OK.
Select IL in the ORIGIN STATE slicer and PA in the DESTINATION STATE slicer.
Answer the following questions using the filtered parsed data.
Lecture 5 In-class Exercise - Q7-Q8
Question 7. How many passengers on Southwest Airlines flew from anywhere in Illinois (IL) to anywhere in Pennsylvania (PA).
Round answer to closest thousand (K).
Question 8. What percentage of all passengers flying between these two state flew on Southwest?
HINT: Right click on the Sum of Passengers column and select the Show Values As option, % of Column Total.
Summarizing Data by Group
Worksheet 3 contains the Titanic data from previous lectures.
Note that all duplicates have already been removed.
We want to determine what percent of all survivors were under 10 years old.
Convert these data to an Excel Table (Insert > Table).
Click Summarize with PivotTable.
Put the Pivot Table on the Existing Worksheet in location H1.
Summarizing Data by Group Cont’d
Start by setting up the Pivot Table Fields as shown.
Notice that this creates a column for each age which is not ideal.
Grouping Data
Click on any of the column headers that you want to group.
Right-click and click Group in the menu that appears.
Complete the pop-up box as shown.
Lecture 5 In-class Exercise - Q9-Q11
Question 9. How many survivors of the Titanic were under 10 years old?
Question 10. What percent all survivors of the titanic were under 10 years old?
HINT: Right click on the Sum of Passengers column and select the Show Values As option, % of Row Total.
Question 11. Children under 10 had the highest survival rate (percentage) of any age group. Which age group had the second highest survival rate (percentage)?
HINT: Switch the Show Values As option, % of Column Total.
When you are done, clear the Pivot Table fields.
Group Summaries
Now we will find the average fare for each class of ticket.
Start by setting up the Pivot Table Fields as shown.
Click on a numeric value, and right click and then select Average in the Summarize Values By menu.
Lecture 5 In-class Exercise - Q12
Question 12. What was the average fare paid by 2nd class passengers on the Titanic?
HW 3 Demo
Using Worksheet 4 which includes data from last week, you will answer the following question in HW 3.
How many passengers flew to Rapid City, SD in 2024?
Parse the ORIGIN-DESTINATION column into two columns.
Convert the data to a table (Insert > Table)
Create a Pivot Table with
DESTINATION in the Rows field
NO. OF PASSENGERSin the Values field.
Key Points from Today
Pivot Tables
Lecture 4 and 5 demonstrate the use and efficiency of the Pivot Table tools.
Pivot Tables with Slicers
Slicers are an efficient way to filter data with a large number of categories.
Useful Facts from Today:
Parsing and Concatenating text should be done BEFORE creating an Excel Table.
PivotTable data can be presented as counts or percentages.
Sums can be presented as averages or ANY summary statistic to answer a question-
HW 2 was due Monday, 1/27/25 and HW 3 is due Monday, 2/3/25.
To submit an Engagement Question or Comment about material from Lecture 5: Submit it by midnight today (day of lecture).
Source Code
---title: "BUA 345 - Lecture 5"subtitle: "Descriptive Analytics with Pivot Tables Cont'd"author: "Penelope Pooler Eisenbies"date: last-modifiedlightbox: truetoc: truetoc-depth: 3toc-location: lefttoc-title: "Table of Contents"toc-expand: 1format: html: code-line-numbers: true code-fold: true code-tools: trueexecute: echo: fenced---## Housekeeping```{r setup, echo=FALSE, warning=F, message=F, include=F}#| include: false# this line specifies options for default options for all R Chunksknitr::opts_chunk$set(echo=F)# suppress scientific notationoptions(scipen=100)# install helper package that loads and installs other packages, if neededif (!require("pacman")) install.packages("pacman", repos = "http://lib.stat.cmu.edu/R/CRAN/")# install and load required packagespacman::p_load(pacman,tidyverse, magrittr, olsrr, shadowtext, mapproj, knitr, kableExtra, countrycode, usdata, maps, RColorBrewer, gridExtra, ggthemes, gt, mosaicData, epiDisplay, vistributions, psych, tidyquant, dygraphs)# verify packages# p_loaded()```**HW 2 is due Monday, 1/27/25****HW 3 is now posted and is due 2/3/25**### Today's plan- Pivot Tables Continued - Answering questions with multiple skills- In Class preview of HW 3 - Question 1<br>::: fragment**In-class Polling (Session ID: bua345s25)**:::## ### Lecture 5 In-class Exercise - Q1-Q2{fig-align="center"}<br>**Question 1.** Which movie in 2024 had the most days as the number one grossing film?<br>**Question 2.** How many days in 2024 was this movie at number one.## Pivot Tables - Descriptive Statistics- Worksheet 1 contains U.S. domestic flight data for November and December- How many PASSENGERS flew from San Francisco to Chicago, IL on UNITED?::: fragment{fig-align="center"}:::- There is no need to parse text in Worksheet 1 so start by converting these data to an Excel Table (`Insert`\>`Table`):::: fragment{fig-align="center"}:::## :::::::: columns::: {.column width="48%"}### Descriptive Statistics Cont'd1. Click any cell within table, so that ‘Table Design’ tab is visible.2. Click `Summarize with PivotTable`.3. Put the Pivot Table on the `Existing Worksheet` in location `F1`.4. Specify the Pivot Table fields as shown.:::::: {.column width="4%"}:::::::: {.column width="48%"}::: fragment{fig-align="center" height="2in"}:::::: fragment{fig-align="center" height="4in"}::::::::::::::::## ### Lecture 5 In-class Exercise - Q3<br>**Question 3.** How many CARRIERS (airlines) are included in the flights data set on Worksheet 1## ::::::::: columns:::: {.column width="48%"}### Descriptive Statistics Cont'd5. Insert slicers for - `ORIGIN`: San Francisco, CA - `DESTINATION`: Chicago, Il6. OPTIONAL: Click the comma in the `Number` group and round to whole numbers to make the data more readable.::: fragment{fig-align="center"}:::::::::: {.column width="4%"}:::::::: {.column width="48%"}::: fragment{fig-align="center"}:::::: fragment{fig-align="center"}:::::::::::::::::## ### Lecture 5 In-class Exercise - Q4-Q6<br>**Question 4.** How many CARRIERS flew from San Francisco to Chicago?<br>**Question 5.** How many passengers flew on United from San Francisco to Chicago.- Round answer to closest thousand (K).<br>**Question 6.** What percentage of all passengers flying this route flew on United (in Worksheet 1).- **HINT:** Right click on the Sum of Passengers column and select the `Show Values As` option, `% of Column Total`.- Round answer to closest whole number.## Parsing and Summarizing Data- Worksheet 2 also contains flight passenger data.- We want to know how many passengers on Southwest Airlines flew from anywhere in Illinois (IL) to anywhere in Pennsylvania (PA).- In order to answer this question, we have to first parse the `ORIGIN / DESTINATION` in to four separate columns: - ORIGIN CITY - ORIGIN STATE - DESTINATION CITY - DESTINATION STATE## :::::::: columns::: {.column width="58%"}### Parsing with two Delimiters1. Select the `ORIGIN > DESTINATION` column.2. Click `Text to Columns` on the `Data` tab.3. Specify `Delimited`, default, in the pop-up and click next.4. Update the column headers to be the following. - ORIGIN CITY, ORIGIN STATE - DESTINATION CITY, DESTINATION STATE:::::: {.column width="4%"}:::::::: {.column width="38%"}::: fragment{fig-align="center" height="2.5in"}:::::: fragment{fig-align="center" height="2.5in"}::::::::::::::::::: fragment{fig-align="center" height=".75in"}:::## ::::::: columns::: {.column width="48%"}### Summarizing Parsed Data1. Convert Worksheet 2 to an Excel Table (`Insert`\>`Table`)2. Click `Summarize with PivotTable`.3. Put the Pivot Table on the `Existing Worksheet` in location `H1`.4. Specify the fields in the Pivot Table as shown.:::::: {.column width="4%"}::::::: {.column width="48%"}::: fragment{fig-align="center"}::::::::::::::## :::::::: columns::: {.column width="48%"}### Summarizing Parsed Data Cont'd1. Click on any cell in the Pivot Table and click `Insert Slicers`2. Select `ORIGIN STATE` and `DESTINATION STATE` and click OK.3. Select `IL` in the `ORIGIN STATE` slicer and `PA` in the `DESTINATION STATE` slicer.4. Answer the following questions using the filtered parsed data.:::::: {.column width="4%"}:::::::: {.column width="48%"}::: fragment{fig-align="center" height="2.5in"}:::::: fragment{fig-align="center" height="2.5in"}::::::::::::::::## ### Lecture 5 In-class Exercise - Q7-Q8<br>**Question 7.** How many passengers on Southwest Airlines flew from anywhere in Illinois (IL) to anywhere in Pennsylvania (PA).- Round answer to closest thousand (K).<br>**Question 8.** What percentage of all passengers flying between these two state flew on Southwest?- **HINT:** Right click on the Sum of Passengers column and select the `Show Values As` option, `% of Column Total`.## :::::::: columns::: {.column width="48%"}### Summarizing Data by GroupWorksheet 3 contains the Titanic data from previous lectures.- Note that all duplicates have already been removed.- **We want to determine what percent of all survivors were under 10 years old.**1. Convert these data to an Excel Table (`Insert`\>`Table`).2. Click `Summarize with PivotTable`.3. Put the Pivot Table on the `Existing Worksheet` in location `H1`.:::::: {.column width="4%"}:::::::: {.column width="48%"}::: fragment{fig-align="center" height="2in"}:::::: fragment{fig-align="center"}::::::::::::::::## ::::::: columns::: {.column width="48%"}### Summarizing Data by Group Cont'd1. Start by setting up the Pivot Table Fields as shown.2. Notice that this creates a column for each age which is not ideal.:::::: {.column width="4%"}::::::: {.column width="48%"}::: fragment{fig-align="center" height="4in"}::::::::::::::::: fragment{fig-align="center"}:::## :::::::: columns::: {.column width="48%"}### Grouping Data1. Click on any of the column headers that you want to group.2. Right-click and click Group in the menu that appears.3. Complete the pop-up box as shown.:::::: {.column width="4%"}:::::::: {.column width="48%"}::: fragment{fig-align="center" height="2in"}:::::: fragment{fig-align="center" height="2in"}::::::::::::::::::: fragment{fig-align="center"}:::## ### Lecture 5 In-class Exercise - Q9-Q11<br>**Question 9.** How many survivors of the Titanic were under 10 years old?<br>**Question 10.** What percent all survivors of the titanic were under 10 years old?- **HINT:** Right click on the Sum of Passengers column and select the `Show Values As` option, `% of Row Total`.<br>**Question 11.** Children under 10 had the highest survival rate (percentage) of any age group. Which age group had the second highest survival rate (percentage)?- **HINT:** Switch the `Show Values As` option, `% of Column Total`.- **When you are done, clear the Pivot Table fields.**## :::::::: columns::: {.column width="48%"}### Group SummariesNow we will find the average fare for each class of ticket.1. Start by setting up the Pivot Table Fields as shown.2. Click on a numeric value, and right click and then select `Average` in the `Summarize Values By` menu.:::::: {.column width="4%"}:::::::: {.column width="48%"}::: fragment{fig-align="center" height="3in"}:::::: fragment{fig-align="center" height="3in"}::::::::::::::::## ### Lecture 5 In-class Exercise - Q12<br>**Question 12.** What was the average fare paid by 2nd class passengers on the Titanic?## HW 3 DemoUsing Worksheet 4 which includes data from last week, you will answer the following question in HW 3.- **How many passengers flew to Rapid City, SD in 2024?**1. Parse the `ORIGIN-DESTINATION` column into two columns.2. Convert the data to a table (`Insert`\>`Table`)3. Create a Pivot Table with - `DESTINATION` in the `Rows` field - `NO. OF PASSENGERS`in the `Values` field.::: fragment:::## ### Key Points from Today::: fragment**Pivot Tables**:::- Lecture 4 and 5 demonstrate the use and efficiency of the Pivot Table tools.::: fragment**Pivot Tables with Slicers**:::- Slicers are an efficient way to filter data with a large number of categories.::: fragment**Useful Facts from Today:**:::- Parsing and Concatenating text should be done BEFORE creating an Excel Table.- PivotTable data can be presented as counts or percentages.- Sums can be presented as averages or ANY summary statistic to answer a question-::: fragment**HW 2 was due Monday, 1/27/25 and HW 3 is due Monday, 2/3/25.****To submit an Engagement Question or Comment about material from Lecture 5:** Submit it by midnight today (day of lecture).:::