Accessing Datasets from R-Studio Packages ###########################################

#1.a - Open the Datasets Package in R-Studio

library(datasets)

#1.b - Load the dataset called sleep from the Datasets Package

sleep
#Note: the dataset is called sleep

#1.c - Save the original sleep dataset into R as an object & make a copy that we will work with

copy <- sleep
trust <- copy

#Hint: See the section titled Good Practice/General Recommendation in the example video
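The reason for working on a copy can be seen in a small sketch: changing the copy leaves the original untouched. The object names `original` and `working` are illustrative and are not part of the assignment.

```r
library(datasets)

# Keep one untouched object and one working copy (names are illustrative)
original <- sleep
working <- original

# Change a value in the working copy only
working$extra[1] <- NA

# The original still matches the packaged dataset; only the copy changed
identical(original, sleep)
is.na(working$extra[1])
```

If a step goes wrong, the working copy can be rebuilt from the original without reloading anything.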

Characteristics/Descriptions of the Sleep Dataset PART 1 ###################################################

2.a How many rows are there in the sleep dataset?

nrow(trust)
[1] 20

2.b How many columns are there in the sleep dataset?

ncol(trust)
[1] 3
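As a quick cross-check, base R's dim() returns the row and column counts together. This is a minimal sketch that rebuilds the working copy so it runs on its own; the name `dims` is illustrative.

```r
library(datasets)

# Rebuild the working copy so the example is self-contained
trust <- sleep

# dim() returns c(number of rows, number of columns)
dims <- dim(trust)
dims

# The two entries agree with nrow() and ncol()
dims[1] == nrow(trust)
dims[2] == ncol(trust)
```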

2.c What are the names of the columns in the sleep dataset?

colnames(trust)
[1] "extra" "group" "ID"   

2.d What type of variables are the 3 variables in the sleep dataset?

str(trust)
'data.frame':   20 obs. of  3 variables:
 $ extra: num  0.7 -1.6 -0.2 -1.2 -0.1 3.4 3.7 0.8 0 2 ...
 $ group: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ ID   : Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
#There is one numerical variable and two factor variables 

summary(trust)
     extra        group        ID   
 Min.   :-1.600   1:10   1      :2  
 1st Qu.:-0.025   2:10   2      :2  
 Median : 0.950          3      :2  
 Mean   : 1.540          4      :2  
 3rd Qu.: 3.400          5      :2  
 Max.   : 5.500          6      :2  
                         (Other):8  
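The variable types reported by str() can also be pulled out one column at a time. The sketch below uses base R only; `var_classes` is an illustrative name.

```r
library(datasets)

# Rebuild the working copy so the example is self-contained
trust <- sleep

# class() applied to each column gives one type per variable
var_classes <- sapply(trust, class)
var_classes

# For the factor variables, nlevels() counts the distinct categories
nlevels(trust$group)
nlevels(trust$ID)
```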

Characteristics/Descriptions of the Sleep Dataset PART 2 ###################################################

3.a Run the head() & tail() functions on the Sleep Dataset. What do you see when you run these functions?

head(trust)
tail(trust)

#The head function shows us the first 6 rows of the dataset. The tail function shows us the last 6 rows.
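Six rows is only the default. A minimal sketch of the `n` argument, which both functions accept, is shown here; `first_three` and `last_three` are illustrative names.

```r
library(datasets)

# Rebuild the working copy so the example is self-contained
trust <- sleep

# n controls how many rows are returned
first_three <- head(trust, n = 3)
last_three <- tail(trust, n = 3)

nrow(first_three)

# tail() keeps the original row names, so these are the last rows of the data
rownames(last_three)
```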

3.b Run the summary() function on the Sleep Dataset. What do you see when you run this function?

summary(trust)
     extra        group        ID   
 Min.   :-1.600   1:10   1      :2  
 1st Qu.:-0.025   2:10   2      :2  
 Median : 0.950          3      :2  
 Mean   : 1.540          4      :2  
 3rd Qu.: 3.400          5      :2  
 Max.   : 5.500          6      :2  
                         (Other):8  
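The pieces of the summary() output can be reproduced one at a time, which may help show where each number comes from. This is a sketch in base R; the variable names are illustrative.

```r
library(datasets)

# Rebuild the working copy so the example is self-contained
trust <- sleep

# Numeric column: the same statistics summary() reports for extra
mean_extra <- mean(trust$extra)
median_extra <- median(trust$extra)
range_extra <- range(trust$extra)   # minimum and maximum

# Factor column: summary() reports the count in each level
group_counts <- table(trust$group)

mean_extra
median_extra
range_extra
group_counts
```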

Characteristics/Descriptions of the Sleep Dataset PART 3 ###################################################

4.a Run the describe() function on the sleep dataset. What does this function tell you that the str() and summary() functions do not? You can ignore the column headings for things that we have not learned yet.

#install.packages("psych")
library(psych)
describe(trust)

#Note: Make sure to load the psych library first to access the describe() & describeBy() functions
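Among other columns, describe() reports the number of observations, the standard deviation, and the standard error, none of which appear in the str() or summary() output. The sketch below computes those three by hand in base R for the extra column; it is not a replacement for describe(), and the variable names are illustrative.

```r
library(datasets)

# Rebuild the working copy so the example is self-contained
trust <- sleep

# Number of non-missing observations
n_obs <- sum(!is.na(trust$extra))

# Sample standard deviation
sd_extra <- sd(trust$extra)

# Standard error of the mean: sd divided by the square root of n
se_extra <- sd_extra / sqrt(n_obs)

n_obs
sd_extra
se_extra
```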

4.b Run the describeBy() function (formerly describe.by()) on the sleep dataset for the grouping variable titled group. What does this output tell you? Why are there multiple outputs here?

describeBy(trust, group = trust$group)
#describe.by() is deprecated, and describeBy() needs the grouping variable passed explicitly through the group argument.
#There are multiple outputs because describeBy() prints one table of descriptive statistics for each level of group (1 and 2).

#Note: Grouping variable is called group
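Per-group statistics can also be sketched in base R with tapply(), which applies a function to the extra values within each level of group. This only illustrates the idea of splitting by a grouping variable; it does not reproduce the full psych output, and the variable names are illustrative.

```r
library(datasets)

# Rebuild the working copy so the example is self-contained
trust <- sleep

# One mean and one standard deviation per level of group
group_means <- tapply(trust$extra, trust$group, mean)
group_sds <- tapply(trust$extra, trust$group, sd)

group_means
group_sds
```

Each result has one entry per group, which is the same reason the grouped describe output comes in multiple parts.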

Opening Datasets from Local Files ###################################

5.a Open both of the Stat 200 datasets listed on the course website under the folder titled R-Studio documents.

#Excel Version
#install.packages("readxl")
library(readxl)
read_xlsx("/Users/sscoli/Downloads/STAT200 Data.xlsx")

#CSV Version
#install.packages("readr")
library(readr)
read_csv("/Users/sscoli/Downloads/STAT200 Data.csv")
Rows: 93 Columns: 6
-- Column specification --------------
Delimiter: ","
dbl (6): Exam Scores (1-100), Hour...

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
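Two things commonly go wrong when opening a local file: the path does not exist, and column headers with spaces or parentheses get renamed. The sketch below shows both checks using base R's read.csv() as a stand-in for read_csv(). The temporary file, its values, and the column name "Other Column" are made up for illustration and are not the course data.

```r
# Write a tiny stand-in CSV to a temporary file (made-up values)
csv_path <- tempfile(fileext = ".csv")
writeLines(c('"Exam Scores (1-100)","Other Column"', "90,5", "75,3"), csv_path)

# Stop early with a clear error if the path is wrong
stopifnot(file.exists(csv_path))

# check.names = FALSE keeps headers with spaces and parentheses unchanged
demo <- read.csv(csv_path, check.names = FALSE)

colnames(demo)
nrow(demo)
```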

5.b Save the original Stat 200 datasets into R as an object & make a copy that we will work with. Do so for both the xlsx and csv datasets.

Excel <- read_xlsx("/Users/sscoli/Downloads/STAT200 Data.xlsx")
CSV <- read_csv("/Users/sscoli/Downloads/STAT200 Data.csv")
Rows: 93 Columns: 6
-- Column specification --------------
Delimiter: ","
dbl (6): Exam Scores (1-100), Hour...

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
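#Make working copies; assigning an object to itself, as in (Excel <- Excel), does not create a copy.
#The names Excel_copy and CSV_copy are illustrative.
Excel_copy <- Excel
CSV_copy <- CSV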