EDA Question: Does the weekly salary of workers differ by gender and by industry?

Data structure

## Parsed with column specification:
## cols(
##   Occupation = col_character(),
##   Industry = col_character(),
##   All_workers = col_double(),
##   All_weekly = col_double(),
##   M_workers = col_double(),
##   M_weekly = col_double(),
##   F_workers = col_double(),
##   F_weekly = col_double()
## )
## Classes 'spec_tbl_df', 'tbl_df', 'tbl' and 'data.frame': 535 obs. of  8 variables:
##  $ Occupation : chr  "Chief executives" "General and operations managers" "Legislators" "Advertising and promotions managers" ...
##  $ Industry   : chr  "Management" "Management" "Management" "Management" ...
##  $ All_workers: num  1046 823 8 55 948 ...
##  $ All_weekly : num  2041 1260 NA 1050 1462 ...
##  $ M_workers  : num  763 621 5 29 570 24 96 466 551 7 ...
##  $ M_weekly   : num  2251 1347 NA NA 1603 ...
##  $ F_workers  : num  283 202 4 26 378 35 73 169 573 16 ...
##  $ F_weekly   : num  1836 1002 NA NA 1258 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Occupation = col_character(),
##   ..   Industry = col_character(),
##   ..   All_workers = col_double(),
##   ..   All_weekly = col_double(),
##   ..   M_workers = col_double(),
##   ..   M_weekly = col_double(),
##   ..   F_workers = col_double(),
##   ..   F_weekly = col_double()
##   .. )
##   Occupation          Industry          All_workers       All_weekly    
##  Length:535         Length:535         Min.   :   0.0   Min.   : 354.0  
##  Class :character   Class :character   1st Qu.:  20.0   1st Qu.: 623.0  
##  Mode  :character   Mode  :character   Median :  63.0   Median : 859.0  
##                                        Mean   : 203.9   Mean   : 912.9  
##                                        3rd Qu.: 190.0   3rd Qu.:1122.5  
##                                        Max.   :2806.0   Max.   :2041.0  
##                                                         NA's   :236     
##    M_workers         M_weekly      F_workers          F_weekly     
##  Min.   :   0.0   Min.   : 389   Min.   :   0.00   Min.   : 380.0  
##  1st Qu.:  11.0   1st Qu.: 678   1st Qu.:   2.50   1st Qu.: 544.0  
##  Median :  32.0   Median : 924   Median :  16.00   Median : 737.0  
##  Mean   : 113.5   Mean   :1006   Mean   :  90.31   Mean   : 809.4  
##  3rd Qu.: 100.5   3rd Qu.:1264   3rd Qu.:  68.50   3rd Qu.: 986.0  
##  Max.   :2582.0   Max.   :2251   Max.   :2262.00   Max.   :1836.0  
##                   NA's   :326                      NA's   :366
## 
##            Agricultural                    Arts                Business 
##                       9                      19                      28 
##           Computational            Construction                Culinary 
##                      16                      40                      13 
##               Education             Engineering          Groundskeeping 
##                      11                      21                       6 
## Healthcare Professional      Healthcare Support                   Legal 
##                      33                      11                       5 
##             Maintenance              Management                  Office 
##                      37                      30                      52 
##              Production      Protective Service                   Sales 
##                      81                      18                      18 
##                 Science                 Service          Social Service 
##                      23                      20                       8 
##          Transportation 
##                      36
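The output above is consistent with calling `str()`, `summary()`, and `table()` on the loaded data. A minimal sketch of those calls follows. The report does not give the file name, so instead of a `read_csv()` call this sketch builds a small data frame from the first four rows visible in the `str()` output; the object name `salary` is an assumption made for illustration.

```r
# First four rows as shown in the str() output above; the full data has 535 rows.
# The name `salary` is hypothetical and does not come from the report.
salary <- data.frame(
  Occupation  = c("Chief executives", "General and operations managers",
                  "Legislators", "Advertising and promotions managers"),
  Industry    = c("Management", "Management", "Management", "Management"),
  All_workers = c(1046, 823, 8, 55),
  All_weekly  = c(2041, 1260, NA, 1050),
  M_workers   = c(763, 621, 5, 29),
  M_weekly    = c(2251, 1347, NA, NA),
  F_workers   = c(283, 202, 4, 26),
  F_weekly    = c(1836, 1002, NA, NA),
  stringsAsFactors = FALSE
)

str(salary)              # column types and leading values
summary(salary)          # per-column summaries, including NA counts
table(salary$Industry)   # number of occupations listed per industry
```

On the full data, the same three calls would be expected to produce output of the shape shown above.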

Total number of missing values

## [1] 928
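This total matches the per-column NA counts in the summary above (236 + 326 + 366 = 928), so all missing values sit in the three weekly-salary columns. A minimal sketch of how such a count can be computed, assuming a hypothetical data frame `salary` that holds only the weekly columns of the first four rows shown earlier:

```r
# Weekly-salary columns for the first four rows shown in the str() output.
# The name `salary` is hypothetical and does not come from the report.
salary <- data.frame(
  All_weekly = c(2041, 1260, NA, 1050),
  M_weekly   = c(2251, 1347, NA, NA),
  F_weekly   = c(1836, 1002, NA, NA)
)

total_missing  <- sum(is.na(salary))      # missing cells across the whole table
missing_by_col <- colSums(is.na(salary))  # missing cells per column

print(total_missing)
print(missing_by_col)
```

Reporting the per-column breakdown alongside the total makes it clear which variables drive the missingness.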

Number of observations

## [1] 535

Visual exploration with different plots

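The plots themselves are not reproduced in this output. The following is a rough sketch of how a box plot by gender and a male-versus-female scatter plot could be drawn with ggplot2. It assumes a hypothetical data frame `salary` built from the first four rows shown in the data structure section; the object names, axis labels, and dashed reference line are illustrative choices, not the report's original code.

```r
library(ggplot2)

# First four rows from the str() output; `salary` is a hypothetical name.
salary <- data.frame(
  Occupation = c("Chief executives", "General and operations managers",
                 "Legislators", "Advertising and promotions managers"),
  M_weekly   = c(2251, 1347, NA, NA),
  F_weekly   = c(1836, 1002, NA, NA),
  stringsAsFactors = FALSE
)

# Long format: one row per occupation and gender, dropping missing salaries.
weekly_long <- data.frame(
  Occupation = rep(salary$Occupation, times = 2),
  Gender     = rep(c("Male", "Female"), each = nrow(salary)),
  Weekly     = c(salary$M_weekly, salary$F_weekly),
  stringsAsFactors = FALSE
)
weekly_long <- weekly_long[!is.na(weekly_long$Weekly), ]

# Box plot of weekly salary by gender.
p_box <- ggplot(weekly_long, aes(x = Gender, y = Weekly)) +
  geom_boxplot() +
  labs(x = "Gender", y = "Weekly salary")

# Scatter plot of female vs. male weekly salary for occupations with both values.
paired <- salary[!is.na(salary$M_weekly) & !is.na(salary$F_weekly), ]
p_scatter <- ggplot(paired, aes(x = M_weekly, y = F_weekly)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  labs(x = "Male weekly salary", y = "Female weekly salary")

print(p_box)
print(p_scatter)
```

In the scatter plot, points below the dashed line correspond to occupations where the female weekly salary is lower than the male weekly salary.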

Summary

The box plots show a difference in weekly income between genders: overall, male weekly salary is higher than female weekly salary. The scatter plot shows that, within the same occupation, male weekly salary is also higher than female weekly salary.
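The within-occupation comparison can also be checked numerically. The sketch below computes the male-minus-female gap for occupations where both salaries are reported, assuming a hypothetical data frame `salary` built from the first four rows shown in the data structure section; the column names `gap` and `ratio` are introduced here for illustration.

```r
# First four rows from the str() output; `salary` is a hypothetical name.
salary <- data.frame(
  Occupation = c("Chief executives", "General and operations managers",
                 "Legislators", "Advertising and promotions managers"),
  M_weekly   = c(2251, 1347, NA, NA),
  F_weekly   = c(1836, 1002, NA, NA),
  stringsAsFactors = FALSE
)

# Keep only occupations with both a male and a female weekly salary.
paired <- salary[complete.cases(salary[, c("M_weekly", "F_weekly")]), ]

paired$gap   <- paired$M_weekly - paired$F_weekly   # absolute difference
paired$ratio <- paired$F_weekly / paired$M_weekly   # female salary as share of male

# Share of paired occupations in which the male salary is higher.
share_male_higher <- mean(paired$gap > 0)

print(paired)
print(share_male_higher)
```

For the two complete rows in this sample, the gaps are 415 and 345. On the full data, the share of positive gaps gives a quick measure of how consistently the pattern holds across occupations.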