The Pivot Table below shows the COUNTS of Titanic passengers in each class that did or did not survive the disaster.
Question 1. If we want to know the percent of passengers that survived (Yes) that were in second class, what option should we select for Show Values As?
A. % of Column Total
B. % of Row Total
C. % of Grand Total
Question 2. If we want to know the percent of ALL passengers that were in third class and did not survive, what option should we select for Show Values As?
Data Set Queries
A useful skill is to be able to find specific observations quickly in a large database.
Excel facilitates this process with these commands:
VLOOKUP
MATCH
You are working for a Non-profit that offers celebrity autographs in exchange for donations, so you need to contact celebrities.
This first worksheet only has 4 observations, but what if it had 400?
You want to find a phone number quickly, just based on STAGE NAME.
We will create a VLOOKUP command to do this.
VLOOKUP Command
We will use VLOOKUP to automate a datebase search.
We enter Lady Gaga in the cell next to ‘STAGE NAME’
We create a VLOOKUP function to output correct phone number in cell next to ‘FAN PHONE NO.’
VLOOKUP inputs
G2: Where reference value is entered (see Worksheet 1)
B2:D5: Data range to be searched
Note: VLOOKUP requires that Column with reference value is FIRST COLUMN (Index = 1)
STAGE NAME (Column B) is first column
3: Column index where query answer is.
FAN PHONE NO is 3rd column counting from STAGE NAME
FALSE: We enter FALSE to indicate we want an EXACT MATCH
VLOOKUP Example 2
Now you have a different database (Worksheet 2) where STAGE NAME is listed in COLUMN A.
Using this new database, we want to find Rihanna’s phone number
Updating the inputs
G2: Where reference value is entered (see Worksheet 1)
??:???: Data range to be searched
Note: VLOOKUP requires that Column with reference value is FIRST COLUMN (Index = 1)
STAGE NAME (Column A) is first column
?: Column index where query answer is.
FAN PHONE NO is 4th column counting from STAGE NAME
FALSE: We enter FALSE to indicate we want an EXACT MATCH
Lecture 6 In-class Exercise - Q3
We want to use
=VLOOKUP(G2, ?, ?, FALSE) to find Rihanna’s phone number in the database in Worksheet 2.
What should the missing inputs be?
HINT: There are two correct choices and either choice will receive full credit.
A.…, B2:D5, 3, …
B.…, A1:D5, 4, …
C.…, A2:D5, 3, …
D.…, A2:D5, 4, …
E.…, B1:D5, 3, …
VLOOKUP
Error without Data Validation
Using the Worksheet 2 Database, we want to find Beyonce’s phone number phone number, but there’s a common problem.
The accent in Beyonce’s name causes an error.
How do we deal with this?
We use ‘Data Validation’ to restrict options for G2 to ONLY VALID entries.
Without Data Validation:
Adding Data Validation
Click cell G2
Select the Data Validation icon on the Data tab.
Complete the pop-box as shown and click OK.
Now cell G2 has a drop-down menu of correct values.
Another Data Validation Example
Worksheet 3 contains weather data. December 2023 to February 2024
We want to create a query that will output a value when the user specifies a date AND a weather variable.
Our first VLOOKUP command has to be manually changed if we want to change the weather variable, e.g from min. temp to snowfall.
Then we can automate this change with an embedded MATCH command.
Data Validation
Next steps are to add data validation to cells H2 and H3
We want a pull-down menu in H2 so that only one of the dates can be selected.
We want a pull-down menu in H3 so that only one of the weather variables can be chosen.
Cell H2 Data Validation:
Cell H3 Data Validation:
Lecture 6 In-class Exercise - Q4
If we want to know the snowfall for different dates, what would change in our VLOOKUP command?
HINT: The third input before FALSE is the only input that needs to change.
A. It would change from 3 to 1.
B. It would change from 3 to 2.
C. It would change from 3 to 4.
D. It would change from 3 to 5.
How do we correct the incorrect VLOOKUP below?
Match Command
We can automate the column choice by adding a MATCH command
How do we AUTOMATE the lookup column in the VLOOKUP below?
In place of the third input we embed a MATCH command.
Replace third input with MATCH command:
H3: Input value we want to match
B1:E1: Array of cells to search for input value
0: Type of match. 0 indicates EXACT MATCH
VLOOKUP with Range Option
We begin this example by creating a range lookup table.
Here are the ranges for Hurricane categories.
We need to convert these to ranges that Excel can understand.
Range lookup will only work with lower limits in ASCENDING ORDER.
Here are the CORRECT inputs for a Range Lookup table for Hurricane categories.
Update the table in columns D and E on Worksheet 4.
Putting all the Skills Together
In Cell B1, we will add data validation so that we have a pull-down list of Hurricane names.
In Cell B2, we will create a VLOOKUP command that outputs the maximum wind speed.
In cell B3 we will create a VLOOKUP with a range option that will output the category of the hurricaine.
Lecture 6 In-class Exercise - Q5
Based on the completed query table, what category was Hurricane Rina?
Enter 1, 2, 3, 4, or 5.
Key Points from Today
Queries using VLOOKUP command
Useful for creating automated searches
VLOOKUP very useful for LARGE data sets
VLOOKUP Query with ‘Data Validation’
Prevents user error due to typos or complex text (accents, etc.)
VLOOKUP with MATCH command
Enables creation of more complex automated queries
VLOOKUP with RANGE lookup
Outputs answers by determining interval that includes reference value entry
HW 3 is due 2/3/2025
To submit an Engagement Question or Comment about material from Lecture 6: Submit it by midnight today (day of lecture).
Source Code
---title: "BUA 345 - Lecture 6"subtitle: "Data Queries"author: "Penelope Pooler Eisenbies"date: last-modifiedlightbox: truetoc: truetoc-depth: 3toc-location: lefttoc-title: "Table of Contents"toc-expand: 1format: html: code-line-numbers: true code-fold: true code-tools: trueexecute: echo: fenced---## Housekeeping```{r setup, echo=FALSE, warning=F, message=F, include=F}#| include: false# this line specifies options for default options for all R Chunksknitr::opts_chunk$set(echo=F)# suppress scientific notationoptions(scipen=100)# install helper package that loads and installs other packages, if neededif (!require("pacman")) install.packages("pacman", repos = "http://lib.stat.cmu.edu/R/CRAN/")# install and load required packagespacman::p_load(pacman,tidyverse, magrittr, olsrr, shadowtext, mapproj, knitr, kableExtra, countrycode, usdata, maps, RColorBrewer, gridExtra, ggthemes, gt, mosaicData, epiDisplay, vistributions, psych, tidyquant, dygraphs)# verify packages# p_loaded()```**HW 3 is due 2/3/2025****Sign up for a [FREE Posit Cloud Account](https://posit.cloud/plans/free){target="_blank"} before Tuesday, 2/4/25**### Today's plan- Data Set Queries - `VLOOKUP` - `VLOOKUP` with Data Validation - `VLOOKUP` with MATCH command - `VLOOKUP` with Range option::: fragment**In-class Polling (Session ID: bua345s25)**:::## ### Lecture 6 In-class Exercise - Q1-Q2The Pivot Table below shows the COUNTS of Titanic passengers in each class that did or did not survive the disaster.{fig-align="center"}**Question 1.** If we want to know the percent of passengers that survived (Yes) that were in second class, what option should we select for `Show Values As`?`A.` % of Column Total`B.` % of Row Total`C.` % of Grand Total<br>**Question 2.** If we want to know the percent of ALL passengers that were in third class and did not survive, what option should we select for `Show Values As`?## :::::::: columns::: {.column width="48%"}### Data Set Queries- A useful skill is to be able to find specific observations quickly in a large database.- Excel facilitates this process with these commands: - VLOOKUP - MATCH- You are working for a Non-profit that offers celebrity autographs in exchange for donations, so you need to contact celebrities.- This first worksheet only has 4 observations, but what if it had 400?- You want to find a phone number quickly, just based on STAGE NAME.:::::: {.column width="4%"}:::::::: {.column width="48%"}::: fragment{fig-align="center" height="2.5in"}:::::: fragment{fig-align="center" height="2.5in"}:::- We will create a `VLOOKUP` command to do this.:::::::::::::## `VLOOKUP` Command:::::::: columns::: {.column width="48%"}We will use `VLOOKUP` to automate a datebase search.- We enter Lady Gaga in the cell next to ‘STAGE NAME’- We create a `VLOOKUP` function to output correct phone number in cell next to ‘FAN PHONE NO.’:::::: {.column width="4%"}:::::::: {.column width="48%"}::: fragment{fig-align="center" height="2.5in"}:::::: fragment{fig-align="center" height="2in"}::::::::::::::::## :::::: columns::: {.column width="48%"}### `VLOOKUP` inputs- **`G2`:** Where reference value is entered (see Worksheet 1)- **`B2:D5`:** Data range to be searched - Note: VLOOKUP requires that Column with reference value is FIRST COLUMN (Index = 1) - STAGE NAME (Column B) is first column- **`3`:** Column index where query answer is. - FAN PHONE NO is 3rd column counting from STAGE NAME- **`FALSE`:** We enter FALSE to indicate we want an EXACT MATCH:::::: {.column width="4%"}:::::: {.column width="48%"}{fig-align="center" height="2.5in"}{fig-align="center" height="2in"}:::::::::## `VLOOKUP` Example 2:::::: columns::: {.column width="48%"}- Now you have a different database (Worksheet 2) where STAGE NAME is listed in COLUMN A.- Using this new database, we want to find Rihanna’s phone number:::::: {.column width="4%"}:::::: {.column width="48%"}{fig-align="center" height="2.5in"}{fig-align="center" height="2in"}:::::::::## :::::: columns::: {.column width="48%"}### Updating the inputs- **`G2`:** Where reference value is entered (see Worksheet 1)- **`??:???`:** Data range to be searched - Note: VLOOKUP requires that Column with reference value is FIRST COLUMN (Index = 1) - STAGE NAME (**Column A**) is first column- **`?`:** Column index where query answer is. - FAN PHONE NO is 4th column counting from STAGE NAME- **`FALSE`:** We enter FALSE to indicate we want an EXACT MATCH:::::: {.column width="4%"}:::::: {.column width="48%"}{fig-align="center" height="2.5in"}{fig-align="center" height="2in"}:::::::::## ### Lecture 6 In-class Exercise - Q3:::::: columns::: {.column width="48%"}We want to use`=VLOOKUP(G2, ?, ?, FALSE)` to find Rihanna’s phone number in the database in Worksheet 2.**What should the missing inputs be?****HINT:** There are two correct choices and either choice will receive full credit.**A.** `…, B2:D5, 3, …`**B.** `…, A1:D5, 4, …`**C.** `…, A2:D5, 3, …`**D.** `…, A2:D5, 4, …`**E.** `…, B1:D5, 3, …`:::::: {.column width="4%"}:::::: {.column width="48%"}{fig-align="center" height="2.5in"}{fig-align="center" height="2in"}:::::::::## VLOOKUP::::::: columns::: {.column width="48%"}### Error without Data ValidationUsing the Worksheet 2 Database, we want to find Beyonce’s phone number phone number, but there’s a common problem.- The accent in Beyonce’s name causes an error.- How do we deal with this?- We use ‘Data Validation’ to restrict options for `G2` to ONLY VALID entries.:::::: {.column width="4%"}::::::: {.column width="48%"}{fig-align="center" height="2.5in"}- **Without Data Validation:**::: fragment{fig-align="center" height="1.5in"}::::::::::::::## ## Adding Data Validation::::::: columns:::: {.column width="48%"}1. Click cell `G2`2. Select the `Data Validation` icon on the `Data` tab.::: fragment{fig-align="center"}:::3. Complete the pop-box as shown and click `OK`.- Now cell G2 has a drop-down menu of correct values.::::::: {.column width="4%"}:::::: {.column width="48%"}{fig-align="center"}::::::::::## ### Another Data Validation Example:::::: columns::: {.column width="48%"}- Worksheet 3 contains weather data. December 2023 to February 2024- We want to create a query that will output a value when the user specifies a date AND a weather variable.- Our first `VLOOKUP` command has to be manually changed if we want to change the weather variable, e.g from min. temp to snowfall.- Then we can automate this change with an embedded `MATCH` command.:::::: {.column width="4%"}:::::: {.column width="48%"}{fig-align="center"}{fig-align="center"}:::::::::## :::::: columns::: {.column width="48%"}### Data Validation- Next steps are to add data validation to cells H2 and H3- We want a pull-down menu in H2 so that only one of the dates can be selected.- We want a pull-down menu in H3 so that only one of the weather variables can be chosen.:::::: {.column width="4%"}:::::: {.column width="48%"}**Cell H2 Data Validation:**{fig-align="center"}**Cell H3 Data Validation:**{fig-align="center"}:::::::::## ### Lecture 6 In-class Exercise - Q4:::::: columns::: {.column width="48%"}If we want to know the snowfall for different dates, what would change in our `VLOOKUP` command?**HINT:** The third input before `FALSE` is the only input that needs to change.**A.** It would change from 3 to 1.**B.** It would change from 3 to 2.**C.** It would change from 3 to 4.**D.** It would change from 3 to 5.:::::: {.column width="4%"}:::::: {.column width="48%"}{fig-align="center"}**How do we correct the incorrect VLOOKUP below?**{fig-align="center"}:::::::::## ### Match Command- We can automate the column choice by adding a `MATCH` command::: fragment**How do we AUTOMATE the lookup column in the VLOOKUP below?**{fig-align="center"}:::- In place of the third input we embed a `MATCH` command.::: fragment**Replace third input with `MATCH` command:**{fig-align="center"}:::- **H3**: Input value we want to match- **B1:E1**: Array of cells to search for input value- **0**: Type of match. 0 indicates **EXACT MATCH**## ### VLOOKUP with Range Option- We begin this example by creating a range lookup table.:::::::: columns:::: {.column width="48%"}- Here are the ranges for Hurricane categories.::: fragment{fig-align="center"}:::- We need to convert these to ranges that Excel can understand.- Range lookup will only work with lower limits in ASCENDING ORDER.::::::: {.column width="4%"}::::::: {.column width="48%"}- Here are the CORRECT inputs for a Range Lookup table for Hurricane categories.::: fragment{fig-align="center"}:::- Update the table in columns D and E on Worksheet 4.::::::::::::## ### Putting all the Skills Together{fig-align="center"}1. In Cell `B1`, we will add data validation so that we have a pull-down list of Hurricane names.2. In Cell `B2`, we will create a `VLOOKUP` command that outputs the maximum wind speed.3. In cell `B3` we will create a `VLOOKUP` with a range option that will output the category of the hurricaine.## ### Lecture 6 In-class Exercise - Q5<br>Based on the completed query table, what category was Hurricane Rina?<br>Enter 1, 2, 3, 4, or 5.## ### Key Points from Today- **Queries using VLOOKUP command** - Useful for creating automated searches - VLOOKUP very useful for LARGE data sets- **VLOOKUP Query with ‘Data Validation’** - Prevents user error due to typos or complex text (accents, etc.)- **VLOOKUP with MATCH command** - Enables creation of more complex automated queries- **VLOOKUP with RANGE lookup** - Outputs answers by determining interval that includes reference value entry::: fragment**HW 3 is due 2/3/2025****To submit an Engagement Question or Comment about material from Lecture 6:** Submit it by midnight today (day of lecture).:::