Download the cv2010on.csv file from Moodle, save it on your computer, and upload it to the data folder in RStudio. Revise the read.csv code line so that it matches both the name and the location of the data file. The data contain civil cases filed in the federal district courts of the six New England states from 2011 onward. Each row represents one civil case, and each column one of its characteristics. DEF stands for defendants, PLT for plaintiffs, and nature_of_suit for the type of lawsuit. Change comics to civilcases.
36,643 cases
# Load packages
library(dplyr) # for dplyr functions such as glimpse(), mutate(), and filter()
library(ggplot2) # for ggplot2 functions such as ggplot()
# Import data
civilcases <- read.csv("/resources/rstudio/Business Statistics/data/cv2010on.csv")
# Convert data to a tibble (tbl_df() is deprecated in dplyr 1.0.0 and later; as_tibble() replaces it)
civilcases <- as_tibble(civilcases)
str(civilcases)
## Classes 'tbl_df', 'tbl' and 'data.frame': 36643 obs. of 6 variables:
## $ DISTRICT : Factor w/ 6 levels "CT","MA","ME",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ PLT : Factor w/ 19900 levels "-8",":WALKER EL: VENUS-ANTOINETTE",..: 6393 3300 5130 19442 7175 3482 6269 4384 12436 13162 ...
## $ DEF : Factor w/ 19496 levels "-8","'47 BRAND, LLC",..: 8018 11968 5576 10445 5251 14988 7759 1510 8210 13180 ...
## $ FILEYEAR : int 2011 2011 2011 2011 2011 2011 2011 2011 2011 2011 ...
## $ NOS : int 445 385 442 440 440 190 440 442 190 110 ...
## $ nature_of_suit: Factor w/ 44 levels "ADMINISTRATIVE PROCEDURE ACT/REVIEW OR APPEAL OF AGENCY\nDECISION",..: 7 37 9 28 28 29 28 9 29 19 ...
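The import step above can be sketched in a more defensive form. The example below does not use the real data: it writes a tiny, made-up CSV (cv_demo.csv, a hypothetical stand-in for cv2010on.csv) to a temporary folder, checks that the path exists before reading, and counts the rows. It also passes stringsAsFactors = TRUE, because R 4.0.0 and later read text columns as character by default, so the Factor columns shown in the str() output above would otherwise not appear.

```r
# Hypothetical stand-in for cv2010on.csv: a tiny CSV written to a temp folder
data_dir <- tempdir()
csv_path <- file.path(data_dir, "cv_demo.csv")
demo <- data.frame(
  DISTRICT = c("ME", "ME", "MA", "CT"),
  FILEYEAR = c(2011L, 2011L, 2012L, 2013L),
  NOS      = c(445L, 385L, 442L, 440L)
)
write.csv(demo, csv_path, row.names = FALSE)

# Stop early with a readable message if the file name or folder is wrong
if (!file.exists(csv_path)) stop("File not found: ", csv_path)

# stringsAsFactors = TRUE turns text columns into factors, as in the str() output
cases_demo <- read.csv(csv_path, stringsAsFactors = TRUE)

n_cases <- nrow(cases_demo)                     # one row per case
district_levels <- levels(cases_demo$DISTRICT)  # distinct districts
str(cases_demo)
```

With the real file, the same pattern would use the path from the read.csv line above, and nrow() would give the number of cases.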
Revise the level code below so that R returns all levels (values) in the id variable.
CT MA ME NH RI VT
Revise the table code below so that R returns the answer for the question.
445 cases were filed
levels(civilcases$nature_of_suit)
## [1] "ADMINISTRATIVE PROCEDURE ACT/REVIEW OR APPEAL OF AGENCY\nDECISION"
## [2] "ANTITRUST"
## [3] "ARBITRATION"
## [4] "ASSAULT, LIBEL, AND SLANDER"
## [5] "BANKS AND BANKING"
## [6] "CIVIL RIGHTS ACCOMMODATIONS"
## [7] "CIVIL RIGHTS ADA EMPLOYMENT"
## [8] "CIVIL RIGHTS ADA OTHER"
## [9] "CIVIL RIGHTS JOBS"
## [10] "CONSUMER CREDIT"
## [11] "CONTRACT FRANCHISE"
## [12] "CONTRACT PRODUCT LIABILITY"
## [13] "COPYRIGHT"
## [14] "FAIR LABOR STANDARDS ACT"
## [15] "FALSE CLAIMS ACT"
## [16] "FAMILY AND MEDICAL LEAVE ACT"
## [17] "FOOD AND DRUG ACTS"
## [18] "HEALTH CARE / PHARM"
## [19] "INSURANCE"
## [20] "INTERSTATE COMMERCE"
## [21] "LABOR/MANAGEMENT RELATIONS ACT"
## [22] "LABOR/MANAGEMENT REPORT & DISCLOSURE"
## [23] "MEDICAL MALPRACTICE"
## [24] "MOTOR VEHICLE PERSONAL INJURY"
## [25] "MOTOR VEHICLE PRODUCT LIABILITY"
## [26] "NEGOTIABLE INSTRUMENTS"
## [27] "OCCUPATIONAL SAFETY/HEALTH"
## [28] "OTHER CIVIL RIGHTS"
## [29] "OTHER CONTRACT ACTIONS"
## [30] "OTHER FRAUD"
## [31] "OTHER LABOR LITIGATION"
## [32] "OTHER PERSONAL INJURY"
## [33] "OTHER PERSONAL PROPERTY DAMAGE"
## [34] "OTHER REAL PROPERTY ACTIONS"
## [35] "OTHER STATUTORY ACTIONS"
## [36] "PERSONAL INJURY -PRODUCT LIABILITY"
## [37] "PROPERTY DAMAGE -PRODUCT LIABILTY"
## [38] "RENT, LEASE, EJECTMENT"
## [39] "SECURITIES, COMMODITIES, EXCHANGE"
## [40] "STOCKHOLDER'S SUITS"
## [41] "TORT PRODUCT LIABILITY"
## [42] "TORTS TO LAND"
## [43] "TRADEMARK"
## [44] "TRUTH IN LENDING"
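levels() lists the categories of a factor but not how often each one occurs. As a minimal sketch, the toy factor below (illustrative values only, standing in for civilcases$nature_of_suit) shows how levels(), nlevels(), and table() relate, and how to pull out the count for a single category.

```r
# Toy factor standing in for civilcases$nature_of_suit (values are illustrative)
suit <- factor(c("INSURANCE", "COPYRIGHT", "INSURANCE", "TRADEMARK",
                 "INSURANCE", "COPYRIGHT"))

levels(suit)     # distinct categories, sorted alphabetically
nlevels(suit)    # number of distinct categories

suit_counts <- table(suit)                     # cases per category
insurance_n <- suit_counts[["INSURANCE"]]      # count for one category
sorted_counts <- sort(suit_counts, decreasing = TRUE)  # most frequent first
sorted_counts
```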
tab <- table(civilcases$DISTRICT, civilcases$DISTRICT)
tab
##
## CT MA ME NH RI VT
## CT 9718 0 0 0 0 0
## MA 0 18705 0 0 0 0
## ME 0 0 1988 0 0 0
## NH 0 0 0 2556 0 0
## RI 0 0 0 0 2628 0
## VT 0 0 0 0 0 1048
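Because DISTRICT is crossed with itself, only the diagonal of this table carries information; a one-way call such as table(civilcases$DISTRICT) would be expected to return the same six counts in a single row. The sketch below copies those counts from the output above into a named vector (the vector name is an assumption made for illustration) to check the total and find the busiest district.

```r
# Per-district counts copied from the diagonal of the table above
district_counts <- c(CT = 9718, MA = 18705, ME = 1988,
                     NH = 2556, RI = 2628, VT = 1048)

total_cases <- sum(district_counts)              # should equal the row count of the data
largest <- names(which.max(district_counts))     # district with the most cases
shares <- round(100 * district_counts / total_cases, 1)  # percent of all cases

total_cases
largest
shares
```

The total matches the 36,643 observations reported by str(), and MA accounts for roughly half of all cases.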
Revise the barchart code below to find the answer.
MA handles the largest number of civil cases
ggplot(civilcases, aes(x = DISTRICT)) +
  geom_bar()
Map district to the x-axis and nature of suit to color.
There are too many categories (44 types of suit) for the graph to be readable, which is why it does not look right.
ggplot(civilcases, aes(x = DISTRICT, fill = nature_of_suit)) +
  geom_bar(position = "fill") # position = "fill" stacks the bars and rescales each to a total of 1, so the y-axis shows proportions
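One possible way to make such a chart readable is to keep only the most frequent types of suit and lump the rest into a single category before plotting. The sketch below uses base R on toy data; the labels, the counts, the cutoff keep_n, and the label "OTHER (lumped)" are all assumptions made for illustration, not part of the assignment.

```r
# Toy data; the suit labels and counts are illustrative only
suit <- c(rep("OTHER CIVIL RIGHTS", 5), rep("INSURANCE", 4),
          rep("COPYRIGHT", 3), "ANTITRUST", "ARBITRATION", "TORTS TO LAND")

keep_n <- 3  # assumed cutoff: how many of the most frequent categories to keep

# Names of the keep_n most frequent categories
top <- names(sort(table(suit), decreasing = TRUE))[seq_len(keep_n)]

# Replace every other category with one catch-all label
suit_lumped <- ifelse(suit %in% top, suit, "OTHER (lumped)")
suit_lumped <- factor(suit_lumped, levels = c(top, "OTHER (lumped)"))

table(suit_lumped)
```

A lumped column built this way could be added to the data with mutate() and mapped to fill in place of nature_of_suit, which would leave a handful of colors instead of 44.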