### Note: I tried to use read_excel() and couldn't. I ran install.packages("read_excel") and R reported that the package is not available for this version of R (full message below). read_excel() is a function in the readxl package, so there is no package named read_excel to install.
### I gave up, converted the Excel file to .csv, and did the exercise using the .csv file.
### Warning in install.packages :
### package ‘read_excel’ is not available for this version of R

### A version of this package for your version of R might be available elsewhere,
### see the ideas at
### https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages
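### A minimal sketch of reading an Excel file directly, assuming the readxl package can be installed. It uses an example workbook that ships with readxl as a stand-in, because the original Excel file name is not given here; "mnm.data.xlsx" in the last comment is an assumed name.

```r
# install.packages("readxl")   # run once in the console; the package is readxl, not read_excel
library(readxl)

# readxl bundles a few example workbooks; this one stands in for the M&M file.
example_path <- readxl_example("datasets.xlsx")
example_data <- read_excel(example_path)   # reads the first sheet by default
str(example_data)

# For the homework file the call would look like this
# ("mnm.data.xlsx" is an assumed file name):
# mnms <- read_excel("mnm.data.xlsx")
```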
getwd()
## [1] "C:/Users/Jerome/Documents/0000_Work_Files/0000_Montgomery_College/Data_Science_101/Data_101_Fall_2022/Homework_6_Due_17Oct2022"
library(readr)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6      ✔ dplyr   1.0.10
## ✔ tibble  3.1.8      ✔ stringr 1.4.1 
## ✔ tidyr   1.2.1      ✔ forcats 0.5.2 
## ✔ purrr   0.3.4      
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(dplyr)  # redundant: library(tidyverse) above already attaches dplyr
mnms <- read_csv("mnm.data.csv")
## Rows: 382 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): student_id, color, defect
## dbl (3): id, total, weight_grams
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
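### The message above suggests specifying the column types. A minimal sketch of that, using the column specification that str() reports for this file. To keep it self-contained it reads two rows typed in as literal text (values taken from the first rows shown by str()) instead of the real file; I() for literal data assumes readr 2.0 or later.

```r
library(readr)

# Two rows of literal CSV text standing in for mnm.data.csv.
csv_text <- I("student_id,id,color,defect,total,weight_grams\nAP_LV,1,r,c,27,40\nAP_LV,2,r,l,27,40\n")

# Declaring the types up front silences the column-specification message.
mnm_types <- cols(
  student_id   = col_character(),
  id           = col_double(),
  color        = col_character(),
  defect       = col_character(),
  total        = col_double(),
  weight_grams = col_double()
)

toy <- read_csv(csv_text, col_types = mnm_types)
print(toy)

# With the real file the call would be:
# mnms <- read_csv("mnm.data.csv", col_types = mnm_types)
```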
str(mnms)
## spec_tbl_df [382 × 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ student_id  : chr [1:382] "AP_LV" "AP_LV" "AP_LV" "AP_LV" ...
##  $ id          : num [1:382] 1 2 3 4 5 6 7 8 9 10 ...
##  $ color       : chr [1:382] "r" "r" "r" "r" ...
##  $ defect      : chr [1:382] "c" "l" "z" "z" ...
##  $ total       : num [1:382] 27 27 27 27 27 27 27 27 27 27 ...
##  $ weight_grams: num [1:382] 40 40 40 40 40 40 40 40 40 40 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   student_id = col_character(),
##   ..   id = col_double(),
##   ..   color = col_character(),
##   ..   defect = col_character(),
##   ..   total = col_double(),
##   ..   weight_grams = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
summary(mnms)
##   student_id              id          color              defect         
##  Length:382         Min.   : 1.0   Length:382         Length:382        
##  Class :character   1st Qu.:10.0   Class :character   Class :character  
##  Mode  :character   Median :20.0   Mode  :character   Mode  :character  
##                     Mean   :22.8                                        
##                     3rd Qu.:35.0                                        
##                     Max.   :55.0                                        
##      total        weight_grams  
##  Min.   :18.00   Min.   :25.00  
##  1st Qu.:27.00   1st Qu.:40.00  
##  Median :54.00   Median :48.00  
##  Mean   :45.01   Mean   :44.25  
##  3rd Qu.:55.00   3rd Qu.:50.00  
##  Max.   :56.00   Max.   :50.00
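### summarize() called with no summary expressions returns an empty 1 × 0 tibble; it needs named expressions to compute anything.
summarize(mnms)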
## # A tibble: 1 × 0
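### A minimal sketch of summarize() with named summary expressions. The small tibble is made-up illustration data, not the M&M file; the same call should work on mnms because it has a weight_grams column.

```r
library(dplyr)

# Hypothetical weights, for illustration only.
toy <- tibble(weight_grams = c(40, 48, 50, 25))

# Each name = expression pair becomes one column of the one-row result.
toy_summary <- summarize(
  toy,
  n           = n(),
  mean_weight = mean(weight_grams, na.rm = TRUE),
  sd_weight   = sd(weight_grams, na.rm = TRUE)
)
print(toy_summary)

# On the real data:
# summarize(mnms, n = n(), mean_weight = mean(weight_grams, na.rm = TRUE))
```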
mean(mnms$weight_grams, na.rm = TRUE)
## [1] 44.24607
sd(mnms$weight_grams, na.rm = TRUE)
## [1] 6.59851
boxplot(mnms$weight_grams)

hist(mnms$weight_grams)

### install.packages("moments")  (commented out so the document will knit; run it once in the console instead)
library(moments)
skewness(mnms$weight_grams)
## [1] -1.367271
kurtosis(mnms$weight_grams)
## [1] 4.854292
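### Reading these two numbers: negative skewness means a longer left tail, and moments::kurtosis() reports Pearson's kurtosis, for which a normal distribution scores 3, so a value near 4.85 suggests heavier tails than normal. A minimal base-R sketch of the same moment-based formulas follows; the function names pearson_skewness and pearson_kurtosis are my own, and the input vector is made up.

```r
# k-th central moment, dividing by n (not n - 1).
central_moment <- function(x, k) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^k)
}

# Skewness = m3 / m2^(3/2); 0 for a symmetric sample.
pearson_skewness <- function(x) {
  central_moment(x, 3) / central_moment(x, 2)^(3 / 2)
}

# Pearson's kurtosis = m4 / m2^2; subtract 3 for excess kurtosis.
pearson_kurtosis <- function(x) {
  central_moment(x, 4) / central_moment(x, 2)^2
}

x <- c(25, 40, 40, 48, 50, 50)   # hypothetical weights, for illustration only
print(pearson_skewness(x))
print(pearson_kurtosis(x))
```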
table(mnms$color, mnms$defect)
##     
##       c  l  m  z
##   bl  7 13  4 58
##   br  2  3  0 51
##   g   5 10  1 35
##   o   6  8  2 71
##   r   6  4  1 36
##   y   7  4  1 47
### Defect code z means no defect. Which color has the fewest defects?
table(mnms$color)
## 
## bl br  g  o  r  y 
## 82 56 51 87 47 59
### Calculate the proportion of defect-free (z) M&Ms for each color, in table order: bl, br, g, o, r, y
58/82
## [1] 0.7073171
51/56
## [1] 0.9107143
35/51
## [1] 0.6862745
71/87
## [1] 0.816092
36/47
## [1] 0.7659574
47/59
## [1] 0.7966102
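### The six divisions above can also be done in one step with prop.table(). A minimal sketch, with the counts re-entered by hand from the color by defect table above so that it runs without the data file:

```r
# Counts copied from the table(mnms$color, mnms$defect) output.
counts <- matrix(
  c(7, 13, 4, 58,
    2,  3, 0, 51,
    5, 10, 1, 35,
    6,  8, 2, 71,
    6,  4, 1, 36,
    7,  4, 1, 47),
  nrow = 6, byrow = TRUE,
  dimnames = list(color  = c("bl", "br", "g", "o", "r", "y"),
                  defect = c("c", "l", "m", "z"))
)

# margin = 1 divides each row by its row total, giving proportions within a color.
row_props <- prop.table(counts, margin = 1)

# Defect-free proportion per color, highest first.
no_defect <- sort(row_props[, "z"], decreasing = TRUE)
print(round(no_defect, 3))

# On the real data:
# prop.table(table(mnms$color, mnms$defect), margin = 1)
```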
### Brown has the highest proportion of defect-free M&Ms (about 91%). Green has the lowest (about 69%).