FSAR | Lab #1 - Exercises

Descriptive Statistics

1. How many CEOs are in the sample? Hint: each row corresponds to a CEO, so you can use the functions summarize() and n().
2. How many CEOs have a graduate degree? Hint: you can use the function filter().
3. What is the percentage of CEOs with a graduate degree? Hint: you can use the functions summarize(), sum() and n().
4. What is the average CEO salary? Hint: you can use the functions summarize() and mean().
5. What is the mean CEO salary for those with a graduate degree? Hint: you can use the functions filter(), summarize() and mean().
6. What is the mean CEO salary for those without a graduate degree? Hint: you can use the same functions.
7. How many CEOs have/don’t have a college degree? Hint: you can use the functions group_by(), summarize() and n().
8. How many CEOs have/don’t have a college degree and a graduate degree? Hint: you can use the same functions.
9. Compute the mean, standard deviation, minimum, maximum and median of salary. Hint: you can use the functions summarize(), mean(), sd(), min(), max() and median().
10. Compute the mean, standard deviation, minimum, maximum and median of salary for CEOs with/without a college and graduate degree. Hint: you can use the same functions with group_by().

Get started by loading libraries and reading data.

library(tidyverse)

tb.ceosal2 <- read_delim("data/ceosal2.csv", delim = ",")

# or, equivalently, since the file is comma-separated
tb.ceosal2 <- read_csv("data/ceosal2.csv")
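
Before moving on, it can help to confirm that the file was read as expected. The sketch below is only an illustration: it writes a tiny made-up CSV to a temporary file (the path, the values and the name tb.toy are hypothetical, not the ceosal2 data) and inspects the result with glimpse().

```r
library(readr)  # both packages are attached by library(tidyverse)
library(dplyr)

# Hypothetical two-row file with the three columns used in this lab
path <- tempfile(fileext = ".csv")
writeLines(c("salary,college,grad",
             "1000,1,1",
             "600,1,0"), path)

tb.toy <- read_csv(path)

# glimpse() prints one line per column with its type and first values
glimpse(tb.toy)
```

The same call, glimpse(tb.ceosal2), should show whether salary, college and grad were read as numeric columns.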

How many CEOs are in the sample?

tb.ceosal2 %>% summarize(n_ceo = n())

## # A tibble: 1 × 1
##   n_ceo
##   <int>
## 1   177

# or, using base R
nrow(tb.ceosal2)

## [1] 177

How many CEOs have a graduate degree?

# The variable grad equals 1 when a CEO has a graduate degree and 0 otherwise,
# so we can count the rows where grad equals 1.
tb.ceosal2 %>% filter(grad == 1) %>% summarize(n_ceo = n())

## # A tibble: 1 × 1
##   n_ceo
##   <int>
## 1    94

# Alternatively, due to the binary nature of grad, we can count the number of 
# CEOs who have a graduate degree by summing the variable grad.
tb.ceosal2 %>% summarize(n_ceo = sum(grad))

## # A tibble: 1 × 1
##   n_ceo
##   <dbl>
## 1    94
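
A third option is dplyr's count(), which tallies the rows for every value of a variable in one step. The sketch below compares the three approaches on a small made-up tibble (tb.toy is hypothetical, not the ceosal2 sample) so that it runs on its own.

```r
library(dplyr)  # attached by library(tidyverse)

# Hypothetical toy data: 5 CEOs, 3 of them with a graduate degree
tb.toy <- tibble(grad = c(1, 0, 1, 1, 0))

# 1) Keep the rows with grad == 1 and count them
n_filter <- tb.toy %>% filter(grad == 1) %>% summarize(n_ceo = n())

# 2) Sum the 0/1 indicator
n_sum <- tb.toy %>% summarize(n_ceo = sum(grad))

# 3) count() returns one row per value of grad, with the tally in column n
n_count <- tb.toy %>% count(grad)
n_count
```

Note that summing only works because grad is coded 0/1; filter() and count() do not rely on that coding.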

What is the percentage of CEOs with a graduate degree?

tb.ceosal2 %>% summarize(p_ceo = sum(grad)/n())

## # A tibble: 1 × 1
##   p_ceo
##   <dbl>
## 1 0.531

# Alternatively, the mean of a 0/1 variable is the share of ones.
tb.ceosal2 %>% summarize(p_ceo = mean(grad))

## # A tibble: 1 × 1
##   p_ceo
##   <dbl>
## 1 0.531
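
The value 0.531 is a proportion; expressed as a percentage it is about 53.1%. A minimal sketch of the conversion, using a made-up tibble (tb.toy and the column name pct_ceo are hypothetical):

```r
library(dplyr)  # attached by library(tidyverse)

# Hypothetical toy data: 3 of 4 CEOs have a graduate degree
tb.toy <- tibble(grad = c(1, 1, 0, 1))

pct <- tb.toy %>%
  summarize(p_ceo   = mean(grad),        # proportion between 0 and 1
            pct_ceo = 100 * mean(grad))  # the same quantity in percent
pct
```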

What is the average CEO salary?

tb.ceosal2 %>% summarize(avg_salary = mean(salary))

## # A tibble: 1 × 1
##   avg_salary
##        <dbl>
## 1       866.

What is the mean CEO salary for those with a graduate degree?

tb.ceosal2 %>% filter(grad == 1) %>% 
  summarize(avg_salary = mean(salary))

## # A tibble: 1 × 1
##   avg_salary
##        <dbl>
## 1       864.

What is the mean CEO salary for those without a graduate degree?

tb.ceosal2 %>% 
  filter(grad == 0) %>% 
  summarize(avg_salary = mean(salary))

## # A tibble: 1 × 1
##   avg_salary
##        <dbl>
## 1       868.

How can you answer the two previous questions (5 and 6, the mean salary with and without a graduate degree) in one line?

tb.ceosal2 %>% group_by(grad) %>% summarize(avg_salary = mean(salary))

## # A tibble: 2 × 2
##    grad avg_salary
##   <dbl>      <dbl>
## 1     0       868.
## 2     1       864.

How many CEOs have/don’t have a college degree?

tb.ceosal2 %>% group_by(college) %>% summarize(n_ceo = n())

## # A tibble: 2 × 2
##   college n_ceo
##     <dbl> <int>
## 1       0     5
## 2       1   172

# Alternatively, tabulate the column with base R's table()
tb.ceosal2 %>% select(college) %>% table()

## college
##   0   1 
##   5 172

How many CEOs have/don’t have a college degree AND a graduate degree?

tb.ceosal2 %>% group_by(college, grad) %>% summarize(n_ceo = n())

## `summarise()` has grouped output by 'college'. You can override using the
## `.groups` argument.

## # A tibble: 3 × 3
## # Groups:   college [2]
##   college  grad n_ceo
##     <dbl> <dbl> <int>
## 1       0     0     5
## 2       1     0    78
## 3       1     1    94
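
Two things are worth noting in this output. Only three rows appear because the sample contains no CEO with grad equal to 1 and college equal to 0, and the message is printed because the result is still grouped by college. The sketch below, on a made-up tibble (tb.toy is hypothetical), shows one way to get an ungrouped result, either with the .groups argument mentioned in the message or with count().

```r
library(dplyr)  # attached by library(tidyverse)

# Hypothetical toy data (not the ceosal2 sample)
tb.toy <- tibble(college = c(0, 1, 1, 1, 1),
                 grad    = c(0, 0, 1, 1, 0))

# .groups = "drop" returns an ungrouped tibble and silences the message
n_by_degree <- tb.toy %>%
  group_by(college, grad) %>%
  summarize(n_ceo = n(), .groups = "drop")

# count() gives the same tallies (in a column named n), also ungrouped
n_by_count <- tb.toy %>% count(college, grad)

n_by_degree
```

Dropping the grouping is a design choice: it avoids surprises if further verbs such as mutate() or summarize() are chained afterwards.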

Compute the mean, standard deviation, minimum, maximum and median of salary.

tb.ceosal2 %>%
  summarize(mean_salary = mean(salary),
            sd_salary = sd(salary),
            min_salary = min(salary),
            max_salary = max(salary),
            median_salary = median(salary))

## # A tibble: 1 × 5
##   mean_salary sd_salary min_salary max_salary median_salary
##         <dbl>     <dbl>      <dbl>      <dbl>         <dbl>
## 1        866.      588.        100       5299           707

Compute the mean, standard deviation, minimum, maximum and median of salary for CEOs with/without a college and graduate degree.

tb.ceosal2 %>%
  group_by(grad, college) %>%
  summarize(mean_salary = mean(salary),
            sd_salary = sd(salary), 
            min_salary = min(salary), 
            max_salary = max(salary),
            median_salary = median(salary))

## # A tibble: 3 × 7
## # Groups:   grad [2]
##    grad college mean_salary sd_salary min_salary max_salary median_salary
##   <dbl>   <dbl>       <dbl>     <dbl>      <dbl>      <dbl>         <dbl>
## 1     0       0       1096.      633.        300       1738         1143 
## 2     0       1        853.      679.        174       5299          708.
## 3     1       1        864.      501.        100       2265          706.
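
When the same set of statistics is needed for one or more columns, dplyr's across() can shorten the summarize() call. The following is a sketch on made-up data (tb.toy is hypothetical), not a replacement for the solution above; with a named list of functions, across() names the output columns as column_function, for example salary_mean.

```r
library(dplyr)  # attached by library(tidyverse)

# Hypothetical toy data: two CEOs per group
tb.toy <- tibble(grad   = c(0, 0, 1, 1),
                 salary = c(100, 300, 200, 600))

stats <- tb.toy %>%
  group_by(grad) %>%
  summarize(across(salary,
                   list(mean = mean, sd = sd, min = min,
                        max = max, median = median)),
            .groups = "drop")
stats
```

Adding more columns inside across(), for example across(c(salary, age), ...), would apply the same five functions to each of them, assuming those columns exist in the data.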

FSAR | Lab #1 - Exercises | Fall 2025

5th September, 2025