beavers <- read.csv(file="https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/boot/beaver.csv", header=TRUE, sep=",")
  1. Use the summary function to gain an overview of the data set. Then display the mean and median for at least two attributes.
summary(beavers)
##        X               day             time           temp      
##  Min.   :  1.00   Min.   :307.0   Min.   :   0   Min.   :36.58  
##  1st Qu.: 25.75   1st Qu.:307.0   1st Qu.:1128   1st Qu.:37.15  
##  Median : 50.50   Median :307.0   Median :1535   Median :37.73  
##  Mean   : 50.50   Mean   :307.1   Mean   :1446   Mean   :37.60  
##  3rd Qu.: 75.25   3rd Qu.:307.0   3rd Qu.:1942   3rd Qu.:37.98  
##  Max.   :100.00   Max.   :308.0   Max.   :2350   Max.   :38.35  
##      activ     
##  Min.   :0.00  
##  1st Qu.:0.00  
##  Median :1.00  
##  Mean   :0.62  
##  3rd Qu.:1.00  
##  Max.   :1.00
mean(beavers$day)
## [1] 307.13
mean(beavers$time)
## [1] 1446.2
median(beavers$temp)
## [1] 37.735
median(beavers$time)
## [1] 1535
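The separate mean and median calls above can also be collapsed into a single step. The following is a minimal sketch, assuming a small stand-in data frame (`demo`, a hypothetical name) built from the first six rows printed by `head()` in this document rather than the full download:

```r
# Stand-in data: the first six rows of the beaver data as printed in this document.
demo <- data.frame(
  day  = rep(307, 6),
  time = c(930, 940, 950, 1000, 1010, 1020),
  temp = c(36.58, 36.73, 36.93, 37.15, 37.23, 37.24)
)

# Compute mean and median for each chosen column in one call.
# sapply() returns a matrix with one column per attribute.
cols  <- c("time", "temp")
stats <- sapply(demo[cols], function(x) c(mean = mean(x), median = median(x)))
print(stats)
```

Swapping `demo` for `beavers` would give the same layout for the full data set.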
  1. Create a new data frame with a subset of the columns and rows. Make sure to rename it.
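# Keep rows 1-10, 20 and 30. Column index 1 is listed twice, so X is copied into an extra column named X.1.
beavers.subset<-beavers[c(1:10,20,30), c(1:5, 1)]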
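The column index `c(1:5, 1)` selects column 1 twice, which is why an `X.1` column appears in the later output. A short sketch of the difference, assuming a stand-in data frame (`demo`, hypothetical, with illustrative values) that has the same five columns as the beaver file:

```r
# Stand-in data with the same column layout as the beaver file; values are illustrative only.
demo <- data.frame(
  X     = 1:30,
  day   = rep(307, 30),
  time  = seq(930, by = 10, length.out = 30),
  temp  = seq(36.5, by = 0.02, length.out = 30),
  activ = rep(0, 30)
)

rows <- c(1:10, 20, 30)

# Repeating a column index duplicates that column; R makes the name unique as "X.1".
dup <- demo[rows, c(1:5, 1)]
names(dup)   # includes "X.1"

# Listing each column once keeps only the five original columns.
clean <- demo[rows, 1:5]
names(clean)
```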
  1. Create new column names for the new data frame.
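# This adds a new label column, SL.type, based on temp; the existing column names are left unchanged.
beavers.subset$SL.type<-ifelse(beavers.subset$temp>37,"High Temp","Low Temp")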
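The step asks for new column names, whereas the line above adds a column. Renaming could be sketched as follows, assuming a three-row stand-in data frame (`demo`, hypothetical); the replacement names are illustrative choices, not part of the original assignment:

```r
# Stand-in data: three rows with the beaver file's column layout.
demo <- data.frame(
  X     = 1:3,
  day   = rep(307, 3),
  time  = c(930, 940, 950),
  temp  = c(36.58, 36.73, 36.93),
  activ = c(0, 0, 0)
)

# Replace all column names at once (hypothetical new names, in column order).
names(demo) <- c("id", "obs_day", "clock_time", "body_temp", "active")

# Rename a single column by matching its current name.
names(demo)[names(demo) == "active"] <- "is_active"
names(demo)
```

Matching on the current name is safer than using a position, since it still works if the column order changes.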
  1. Use the summary function to create an overview of your new data frame. Then print the mean and median for the same two attributes. Please compare.
summary(beavers.subset)
##        X              day           time             temp      
##  Min.   : 1.00   Min.   :307   Min.   : 930.0   Min.   :36.58  
##  1st Qu.: 3.75   1st Qu.:307   1st Qu.: 987.5   1st Qu.:36.90  
##  Median : 6.50   Median :307   Median :1025.0   Median :37.01  
##  Mean   : 8.75   Mean   :307   Mean   :1060.8   Mean   :37.00  
##  3rd Qu.: 9.25   3rd Qu.:307   3rd Qu.:1062.5   3rd Qu.:37.17  
##  Max.   :30.00   Max.   :307   Max.   :1420.0   Max.   :37.24  
##      activ        X.1          SL.type         
##  Min.   :0   Min.   : 1.00   Length:12         
##  1st Qu.:0   1st Qu.: 3.75   Class :character  
##  Median :0   Median : 6.50   Mode  :character  
##  Mean   :0   Mean   : 8.75                     
##  3rd Qu.:0   3rd Qu.: 9.25                     
##  Max.   :0   Max.   :30.00
mean(beavers.subset$temp)
## [1] 37.0025
mean(beavers.subset$time)
## [1] 1060.833
median(beavers.subset$temp)
## [1] 37.01
median(beavers.subset$time)
## [1] 1025
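One way to make the requested comparison explicit is to put the full-data and subset statistics side by side. This sketch only re-enters numbers already printed in this document (the full-data mean of temp and median of time are the rounded figures from the first `summary()` output); the data frame name `comparison` is hypothetical:

```r
# Values copied from the output printed in this document.
# The full-data temp mean (37.60) and time median (1535) come from summary(), so they are rounded.
comparison <- data.frame(
  statistic = c("mean time", "median time", "mean temp", "median temp"),
  full      = c(1446.2, 1535, 37.60, 37.735),
  subset    = c(1060.833, 1025, 37.0025, 37.01)
)

# Negative differences mean the subset value is lower than the full-data value.
comparison$difference <- comparison$subset - comparison$full
print(comparison)
```

All four subset values are lower, which is consistent with the subset covering only rows recorded between 930 and 1420, all with activ equal to 0.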
  1. For at least 3 values in a column, rename them so that every occurrence of each value in that column is replaced. For example, suppose I have 20 values of the letter “e” in one column. Rename those values so that all 20 would show as “excellent”.
# time holds numeric clock values, so no row equals "evening" and this assignment changes nothing.
beavers.subset[beavers.subset$time=="evening","time"]<-"evening"
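The assignment above compares the numeric time column to the string "evening", so it matches no rows. A recode that does change every matching value could look like the sketch below. It assumes a stand-in data frame (`demo`, hypothetical) built from the six rows shown by `head()` in this document, and it recodes the SL.type labels; the new labels "cool" and "warm" are illustrative.

```r
# Stand-in data: six rows as printed by head() in this document.
demo <- data.frame(
  time    = c(930, 940, 950, 1000, 1010, 1020),
  temp    = c(36.58, 36.73, 36.93, 37.15, 37.23, 37.24),
  SL.type = c("Low Temp", "Low Temp", "Low Temp",
              "High Temp", "High Temp", "High Temp"),
  stringsAsFactors = FALSE  # keep labels as character on older R versions
)

# Named lookup vector: old label -> new label (hypothetical new labels).
new_labels <- c("Low Temp" = "cool", "High Temp" = "warm")

# Index the lookup by the old labels so every occurrence is replaced at once.
demo$SL.type <- unname(new_labels[demo$SL.type])
table(demo$SL.type)
```

A lookup vector scales to any number of distinct values, whereas one indexed assignment per value has to be repeated for each label.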
  1. Display enough rows to see examples of all of steps 1-5 above.
head(beavers.subset)
##   X day time  temp activ X.1   SL.type
## 1 1 307  930 36.58     0   1  Low Temp
## 2 2 307  940 36.73     0   2  Low Temp
## 3 3 307  950 36.93     0   3  Low Temp
## 4 4 307 1000 37.15     0   4 High Temp
## 5 5 307 1010 37.23     0   5 High Temp
## 6 6 307 1020 37.24     0   6 High Temp
    1. BONUS – place the original .csv in a GitHub repository and have R read it from the link. This will be a very useful skill as you progress in your data science education and career.
# url<- "https://raw.githubusercontent.com/Sizzlo/Datasets/master/csv/boot/beaver.csv"
# beavers1<- read.csv(url)