Download the cv2010on.csv file from Moodle, save it on your computer, and upload it into the data folder of your RStudio. Revise the read.csv line so that it matches both the name and the location (file path) of the data file. The data contain civil cases filed in the federal district courts of the six New England states from 2011 onward. Each row represents a civil case, and each column one of its characteristics. DEF stands for defendant, PLT for plaintiff, and nature_of_suit for the type of lawsuit.
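
A minimal sketch of checking the file path before calling read.csv. The folder name `data`, the temporary directory, and the two made-up rows are assumptions for illustration only; the toy file stands in for the real cv2010on.csv so that the snippet runs on its own.

```r
# Hypothetical location: a "data" folder inside R's temporary directory
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

# Stand-in file with two made-up rows (NOT the real court data)
csv_path <- file.path(data_dir, "cv2010on.csv")
toy <- data.frame(DISTRICT = c("ME", "NH"), FILEYEAR = c(2011, 2012))
write.csv(toy, csv_path, row.names = FALSE)

# Check that the name and location match before reading;
# read.csv stops with an error when the file cannot be found
stopifnot(file.exists(csv_path))
civilCasesToy <- read.csv(csv_path)
str(civilCasesToy)
```

With the real file, only `csv_path` would need to change so that it points at the folder where the file was uploaded.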

Q1. What type of study would be possible with this data? Observational study or experiment?

Observational

Q2. How many civil cases have been filed at the U.S. District Courts in New England?

36643
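
Because each row is one case, the number of cases is the number of rows. A small sketch on a made-up data frame (the name `toyCases` and its values are hypothetical):

```r
# Made-up stand-in for civilCases: one row per case
toyCases <- data.frame(
  DISTRICT = c("ME", "ME", "NH"),
  PLT = c("A", "B", "C")
)

# Number of rows = number of cases
n_cases <- nrow(toyCases)
n_cases
```

On the real data the same call would be `nrow(civilCases)`, which should agree with the observation count that `str(civilCases)` reports.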

Q3. Who is the most frequent defendant in New England?

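FRESENIUS MEDICAL CARE, ET AL (3590 cases, per the DEF column in the summary output). Massachusetts is the district with the most filings, not a defendant.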
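
One way to find the most frequent value of a column is to tally it and sort the tally. A sketch with dplyr on made-up defendant names (all values here are hypothetical):

```r
library(dplyr)

# Made-up defendants; only the column name mirrors the real data
toyCases <- data.frame(
  DEF = c("ACME CORP", "ACME CORP", "BETA LLC",
          "ACME CORP", "BETA LLC", "GAMMA INC")
)

top_def <- toyCases %>%
  count(DEF, sort = TRUE) %>%  # tally rows per defendant, largest count first
  head(1)                      # keep only the top row
top_def
```

Applied to `civilCases`, the first row of the sorted tally should match the top entry that `summary(civilCases)` lists for DEF.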

Q4. In 2011, Fisher filed a lawsuit against the town of Hermon. What type of lawsuit (nature of suit) was it?

CIVIL RIGHTS ADA EMPLOYMENT
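
A single case can be looked up by filtering on plaintiff, defendant, and filing year. The sketch below uses made-up rows (the names DOE, ROE, SPRINGFIELD, and the TYPE labels are hypothetical); only the column names follow the real data.

```r
library(dplyr)

# Made-up rows for illustration only
toyCases <- data.frame(
  PLT = c("DOE", "ROE", "DOE"),
  DEF = c("SPRINGFIELD, TOWN OF", "ACME CORP", "ACME CORP"),
  FILEYEAR = c(2011, 2011, 2012),
  nature_of_suit = c("TYPE A", "TYPE B", "TYPE C"),
  stringsAsFactors = FALSE
)

doe_suit <- toyCases %>%
  # Exact match on plaintiff and year, substring match on defendant
  filter(PLT == "DOE",
         FILEYEAR == 2011,
         grepl("SPRINGFIELD", DEF, fixed = TRUE)) %>%
  select(PLT, DEF, FILEYEAR, nature_of_suit)
doe_suit
```

On the real data the same pattern would use `PLT == "FISHER"` and a match on `"HERMON"`; the first row of `head(civilCases)` shows that case.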

# Load packages
library(dplyr)

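# stringsAsFactors = TRUE reproduces the factor columns shown below
# (R >= 4.0 reads text columns as character by default)
civilCases <- read.csv("/resources/rstudio/BusinessStatistics/Data/cv2010on.csv",
                       stringsAsFactors = TRUE)
civilCases$FILEYEAR <- as.factor(civilCases$FILEYEAR)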
str(civilCases)
## 'data.frame':    36643 obs. of  6 variables:
##  $ DISTRICT      : Factor w/ 6 levels "CT","MA","ME",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ PLT           : Factor w/ 19900 levels "-8",":WALKER EL: VENUS-ANTOINETTE",..: 6393 3300 5130 19442 7175 3482 6269 4384 12436 13162 ...
##  $ DEF           : Factor w/ 19496 levels "-8","'47 BRAND, LLC",..: 8018 11968 5576 10445 5251 14988 7759 1510 8210 13180 ...
##  $ FILEYEAR      : Factor w/ 8 levels "2011","2012",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ NOS           : int  445 385 442 440 440 190 440 442 190 110 ...
##  $ nature_of_suit: Factor w/ 44 levels "ADMINISTRATIVE PROCEDURE ACT/REVIEW OR APPEAL OF AGENCY\nDECISION",..: 7 37 9 28 28 29 28 9 29 19 ...
summary(civilCases)
##  DISTRICT         PLT                                    DEF       
##  CT: 9718   SMITH   :  185   FRESENIUS MEDICAL CARE , ET AL: 3590  
##  MA:18705   SEALED  :  173   ATRIUM MEDICAL CORPORAT, ET AL:  617  
##  ME: 1988   BROWN   :  165   GLAXOSMITHKLINE LLC           :  439  
##  NH: 2556   JOHNSON :  154   FRESENIUS USA, INC., ET AL    :  379  
##  RI: 2628   WILLIAMS:  149   DAVOL, INC., ET AL            :  189  
##  VT: 1048   MARRADI :  146   BOSTON SCIENTIFIC CORP.       :  179  
##             (Other) :35671   (Other)                       :31250  
##     FILEYEAR         NOS                                   nature_of_suit 
##  2014   :6503   Min.   :110.0   HEALTH CARE / PHARM               : 5145  
##  2015   :5884   1st Qu.:360.0   OTHER CIVIL RIGHTS                : 4101  
##  2013   :4763   Median :367.0   OTHER CONTRACT ACTIONS            : 3945  
##  2017   :4505   Mean   :422.1   CIVIL RIGHTS JOBS                 : 3026  
##  2016   :4155   3rd Qu.:443.0   PERSONAL INJURY -PRODUCT LIABILITY: 2988  
##  2011   :4153   Max.   :899.0   OTHER PERSONAL INJURY             : 2035  
##  (Other):6680                   (Other)                           :15403
head(civilCases)
##   DISTRICT        PLT                          DEF FILEYEAR NOS
## 1       ME     FISHER              HERMON, TOWN OF     2011 445
## 2       ME CHANDONAIT          NAVISTAR INC, ET AL     2011 385
## 3       ME     DIONNE EDUCATION, MAINE DEPT, ET AL     2011 442
## 4       ME    WILKINS                MADORE, ET AL     2011 440
## 5       ME      GIBBS          DOROTHEA DIX, ET AL     2011 440
## 6       ME  CHRISTIAN   SCHNEIDER HOMES LLC, ET AL     2011 190
##                      nature_of_suit
## 1       CIVIL RIGHTS ADA EMPLOYMENT
## 2 PROPERTY DAMAGE -PRODUCT LIABILTY
## 3                 CIVIL RIGHTS JOBS
## 4                OTHER CIVIL RIGHTS
## 5                OTHER CIVIL RIGHTS
## 6            OTHER CONTRACT ACTIONS

Q5. What is the most frequent type of civil suit filed in New England?

HEALTH CARE / PHARM (5145 cases)

# Count the number of cases for each nature of suit
civilCases %>%
  count(nature_of_suit) %>%
  arrange(desc(n)) # Sort the table by n in descending order
## # A tibble: 44 x 2
##    nature_of_suit                         n
##    <fct>                              <int>
##  1 HEALTH CARE / PHARM                 5145
##  2 OTHER CIVIL RIGHTS                  4101
##  3 OTHER CONTRACT ACTIONS              3945
##  4 CIVIL RIGHTS JOBS                   3026
##  5 PERSONAL INJURY -PRODUCT LIABILITY  2988
##  6 OTHER PERSONAL INJURY               2035
##  7 CONSUMER CREDIT                     2000
##  8 INSURANCE                           1741
##  9 OTHER STATUTORY ACTIONS             1482
## 10 FAIR LABOR STANDARDS ACT             954
## # ... with 34 more rows

Q6. Of all the civil cases filed at the U.S. District Court of New Hampshire, what percentage was about “HEALTH CARE / PHARM”?

17.4%

civilCases %>%
  count(DISTRICT, nature_of_suit) %>%
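  # Group by district so proportions are computed within each court
  group_by(DISTRICT) %>%
  # Share of each district's cases that fall under each nature of suit
  mutate(prop = n / sum(n)) %>%
  # Keep only the Fair Labor Standards Act rows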
  filter(nature_of_suit == "FAIR LABOR STANDARDS ACT")
## # A tibble: 6 x 4
## # Groups:   DISTRICT [6]
##   DISTRICT nature_of_suit               n   prop
##   <fct>    <fct>                    <int>  <dbl>
## 1 CT       FAIR LABOR STANDARDS ACT   365 0.0376
## 2 MA       FAIR LABOR STANDARDS ACT   378 0.0202
## 3 ME       FAIR LABOR STANDARDS ACT    50 0.0252
## 4 NH       FAIR LABOR STANDARDS ACT    29 0.0113
## 5 RI       FAIR LABOR STANDARDS ACT   114 0.0434
## 6 VT       FAIR LABOR STANDARDS ACT    18 0.0172
civilCases %>%
  count(DISTRICT, nature_of_suit) %>%
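  # Group by district so proportions are computed within each court
  group_by(DISTRICT) %>%
  # Share of each district's cases that fall under each nature of suit
  mutate(prop = n / sum(n)) %>%
  # Keep only the Health Care / Pharm rows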
  filter(nature_of_suit == "HEALTH CARE / PHARM")
## # A tibble: 6 x 4
## # Groups:   DISTRICT [6]
##   DISTRICT nature_of_suit          n    prop
##   <fct>    <fct>               <int>   <dbl>
## 1 CT       HEALTH CARE / PHARM    53 0.00545
## 2 MA       HEALTH CARE / PHARM  4493 0.240  
## 3 ME       HEALTH CARE / PHARM    72 0.0362 
## 4 NH       HEALTH CARE / PHARM   445 0.174  
## 5 RI       HEALTH CARE / PHARM    67 0.0255 
## 6 VT       HEALTH CARE / PHARM    15 0.0143
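
The NH proportion can be checked by hand from two numbers already printed above: 445 HEALTH CARE / PHARM cases in NH (the table just above) and 2556 total NH cases (the DISTRICT column of the summary output). The variable names below are only illustrative.

```r
nh_health <- 445   # HEALTH CARE / PHARM cases filed in NH
nh_total  <- 2556  # all civil cases filed in NH

# Percentage of NH cases that are HEALTH CARE / PHARM, to one decimal place
pct <- round(100 * nh_health / nh_total, 1)
pct
# → 17.4
```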

Q7. What type of sampling method is this?

Stratified Sampling