BUA 345 - Lecture 6

Data Queries

Author

Penelope Pooler Eisenbies

Published

January 29, 2025

Housekeeping

HW 3 is due 2/3/2025

Sign up for a FREE Posit Cloud Account before Tuesday, 2/4/25

Today’s plan

  • Data Set Queries

    • VLOOKUP

    • VLOOKUP with Data Validation

    • VLOOKUP with MATCH command

    • VLOOKUP with Range option

In-class Polling (Session ID: bua345s25)

Lecture 6 In-class Exercise - Q1-Q2

The Pivot Table below shows the COUNTS of Titanic passengers in each class that did or did not survive the disaster.

Question 1. If we want to know the percent of passengers that survived (Yes) that were in second class, what option should we select for Show Values As?

A. % of Column Total

B. % of Row Total

C. % of Grand Total


Question 2. If we want to know the percent of ALL passengers that were in third class and did not survive, what option should we select for Show Values As?

Data Set Queries

  • A useful skill is to be able to find specific observations quickly in a large database.

  • Excel facilitates this process with these commands:

    • VLOOKUP
    • MATCH
  • You are working for a Non-profit that offers celebrity autographs in exchange for donations, so you need to contact celebrities.

  • This first worksheet only has 4 observations, but what if it had 400?

  • You want to find a phone number quickly, just based on STAGE NAME.

  • We will create a VLOOKUP command to do this.

VLOOKUP Command

We will use VLOOKUP to automate a datebase search.

  • We enter Lady Gaga in the cell next to ‘STAGE NAME’

  • We create a VLOOKUP function to output correct phone number in cell next to ‘FAN PHONE NO.’

VLOOKUP inputs

  • G2: Where reference value is entered (see Worksheet 1)

  • B2:D5: Data range to be searched

    • Note: VLOOKUP requires that Column with reference value is FIRST COLUMN (Index = 1)

    • STAGE NAME (Column B) is first column

  • 3: Column index where query answer is.

    • FAN PHONE NO is 3rd column counting from STAGE NAME
  • FALSE: We enter FALSE to indicate we want an EXACT MATCH

VLOOKUP Example 2

  • Now you have a different database (Worksheet 2) where STAGE NAME is listed in COLUMN A.

  • Using this new database, we want to find Rihanna’s phone number

Updating the inputs

  • G2: Where reference value is entered (see Worksheet 1)

  • ??:???: Data range to be searched

    • Note: VLOOKUP requires that Column with reference value is FIRST COLUMN (Index = 1)

    • STAGE NAME (Column A) is first column

  • ?: Column index where query answer is.

    • FAN PHONE NO is 4th column counting from STAGE NAME
  • FALSE: We enter FALSE to indicate we want an EXACT MATCH

Lecture 6 In-class Exercise - Q3

We want to use

=VLOOKUP(G2, ?, ?, FALSE) to find Rihanna’s phone number in the database in Worksheet 2.

What should the missing inputs be?

HINT: There are two correct choices and either choice will receive full credit.

A. …, B2:D5, 3, …

B. …, A1:D5, 4, …

C. …, A2:D5, 3, …

D. …, A2:D5, 4, …

E. …, B1:D5, 3, …

VLOOKUP

Error without Data Validation

Using the Worksheet 2 Database, we want to find Beyonce’s phone number phone number, but there’s a common problem.

  • The accent in Beyonce’s name causes an error.

  • How do we deal with this?

  • We use ‘Data Validation’ to restrict options for G2 to ONLY VALID entries.

  • Without Data Validation:

Adding Data Validation

  1. Click cell G2

  2. Select the Data Validation icon on the Data tab.

  1. Complete the pop-box as shown and click OK.
  • Now cell G2 has a drop-down menu of correct values.

Another Data Validation Example

  • Worksheet 3 contains weather data. December 2023 to February 2024

  • We want to create a query that will output a value when the user specifies a date AND a weather variable.

  • Our first VLOOKUP command has to be manually changed if we want to change the weather variable, e.g from min. temp to snowfall.

  • Then we can automate this change with an embedded MATCH command.

Data Validation

  • Next steps are to add data validation to cells H2 and H3

  • We want a pull-down menu in H2 so that only one of the dates can be selected.

  • We want a pull-down menu in H3 so that only one of the weather variables can be chosen.

Cell H2 Data Validation:

Cell H3 Data Validation:

Lecture 6 In-class Exercise - Q4

If we want to know the snowfall for different dates, what would change in our VLOOKUP command?

HINT: The third input before FALSE is the only input that needs to change.

A. It would change from 3 to 1.

B. It would change from 3 to 2.

C. It would change from 3 to 4.

D. It would change from 3 to 5.

How do we correct the incorrect VLOOKUP below?

Match Command

  • We can automate the column choice by adding a MATCH command

How do we AUTOMATE the lookup column in the VLOOKUP below?

  • In place of the third input we embed a MATCH command.

Replace third input with MATCH command:

  • H3: Input value we want to match
  • B1:E1: Array of cells to search for input value
  • 0: Type of match. 0 indicates EXACT MATCH

VLOOKUP with Range Option

  • We begin this example by creating a range lookup table.
  • Here are the ranges for Hurricane categories.

  • We need to convert these to ranges that Excel can understand.

  • Range lookup will only work with lower limits in ASCENDING ORDER.

  • Here are the CORRECT inputs for a Range Lookup table for Hurricane categories.

  • Update the table in columns D and E on Worksheet 4.

Putting all the Skills Together

  1. In Cell B1, we will add data validation so that we have a pull-down list of Hurricane names.

  2. In Cell B2, we will create a VLOOKUP command that outputs the maximum wind speed.

  3. In cell B3 we will create a VLOOKUP with a range option that will output the category of the hurricaine.

Lecture 6 In-class Exercise - Q5


Based on the completed query table, what category was Hurricane Rina?


Enter 1, 2, 3, 4, or 5.

Key Points from Today

  • Queries using VLOOKUP command

    • Useful for creating automated searches

    • VLOOKUP very useful for LARGE data sets

  • VLOOKUP Query with ‘Data Validation’

    • Prevents user error due to typos or complex text (accents, etc.)
  • VLOOKUP with MATCH command

    • Enables creation of more complex automated queries
  • VLOOKUP with RANGE lookup

    • Outputs answers by determining interval that includes reference value entry

HW 3 is due 2/3/2025

To submit an Engagement Question or Comment about material from Lecture 6: Submit it by midnight today (day of lecture).