beavers <- read.csv(file="https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/boot/beaver.csv", header=TRUE, sep=",")
- Use the summary function to gain an overview of the data set. Then display the mean and median for at least two attributes.
summary(beavers)
##        X               day             time           temp
##  Min.   :  1.00   Min.   :307.0   Min.   :   0   Min.   :36.58
##  1st Qu.: 25.75   1st Qu.:307.0   1st Qu.:1128   1st Qu.:37.15
##  Median : 50.50   Median :307.0   Median :1535   Median :37.73
##  Mean   : 50.50   Mean   :307.1   Mean   :1446   Mean   :37.60
##  3rd Qu.: 75.25   3rd Qu.:307.0   3rd Qu.:1942   3rd Qu.:37.98
##  Max.   :100.00   Max.   :308.0   Max.   :2350   Max.   :38.35
##      activ
##  Min.   :0.00
##  1st Qu.:0.00
##  Median :1.00
##  Mean   :0.62
##  3rd Qu.:1.00
##  Max.   :1.00
mean(beavers$day)
## [1] 307.13
mean(beavers$time)
## [1] 1446.2
median(beavers$temp)
## [1] 37.735
median(beavers$time)
## [1] 1535
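The per-column calls above can also be collapsed into a single step. The following is a minimal sketch, assuming a small stand-in data frame (`toy`, holding only six rows of `time` and `temp` values) rather than the full data set; the names `toy` and `stats` are illustrative and not part of the original analysis.

```r
# Stand-in data: six rows only, not the full 100-row beaver data
toy <- data.frame(
  time = c(930, 940, 950, 1000, 1010, 1020),
  temp = c(36.58, 36.73, 36.93, 37.15, 37.23, 37.24)
)

# sapply() applies a function to every column and returns a named vector
col_means   <- sapply(toy, mean)
col_medians <- sapply(toy, median)

# Combine into one small table: one row per statistic, one column per attribute
stats <- rbind(mean = col_means, median = col_medians)
print(stats)
```

Keeping both statistics in one table makes it easier to compare attributes at a glance than reading separate `mean()` and `median()` calls.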
- Create a new data frame with a subset of the columns and rows. Make sure to rename it.
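# Note: the column index c(1:5, 1) selects column 1 (X) twice, which creates the duplicate X.1 column
beavers.subset <- beavers[c(1:10, 20, 30), c(1:5, 1)]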
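Selecting columns by name is one way to avoid picking the same column twice. The sketch below is only an illustration on a stand-in data frame (`toy`) that reuses the beaver column names; `toy.subset` and `toy.dup` are hypothetical names.

```r
# Stand-in data frame with the same column names as the beaver data
toy <- data.frame(
  X     = 1:6,
  day   = rep(307, 6),
  time  = c(930, 940, 950, 1000, 1010, 1020),
  temp  = c(36.58, 36.73, 36.93, 37.15, 37.23, 37.24),
  activ = rep(0, 6)
)

# Rows are chosen by position, columns by name, so no column repeats by accident
toy.subset <- toy[c(1:3, 5), c("day", "time", "temp")]

# Repeating a column index makes R append ".1" to the duplicated name
toy.dup <- toy[c(1:3, 5), c(1:3, 1)]

print(names(toy.subset))
print(names(toy.dup))
```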
- Create new column names for the new data frame.
beavers.subset$SL.type<-ifelse(beavers.subset$temp>37,"High Temp","Low Temp")
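The line above adds a derived column rather than renaming existing ones. Renaming itself can be done through `names()`. This is a minimal sketch on a stand-in data frame; the new names (`clock.time`, `body.temp`, `temp.celsius`) are hypothetical choices, not names used in the original analysis.

```r
# Stand-in data frame with two of the beaver columns
toy <- data.frame(
  time = c(930, 940, 950),
  temp = c(36.58, 36.73, 36.93)
)

# Rename every column at once by assigning a full vector of names
names(toy) <- c("clock.time", "body.temp")

# Rename a single column by matching its current name
names(toy)[names(toy) == "body.temp"] <- "temp.celsius"

print(names(toy))
```

Matching on the current name, instead of a numeric position, keeps the rename correct even if the column order changes later.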
- Use the summary function to create an overview of your new data frame. Then print the mean and median for the same two attributes. Please compare.
summary(beavers.subset)
##        X              day           time             temp
##  Min.   : 1.00   Min.   :307   Min.   : 930.0   Min.   :36.58
##  1st Qu.: 3.75   1st Qu.:307   1st Qu.: 987.5   1st Qu.:36.90
##  Median : 6.50   Median :307   Median :1025.0   Median :37.01
##  Mean   : 8.75   Mean   :307   Mean   :1060.8   Mean   :37.00
##  3rd Qu.: 9.25   3rd Qu.:307   3rd Qu.:1062.5   3rd Qu.:37.17
##  Max.   :30.00   Max.   :307   Max.   :1420.0   Max.   :37.24
##      activ        X.1          SL.type
##  Min.   :0   Min.   : 1.00   Length:12
##  1st Qu.:0   1st Qu.: 3.75   Class :character
##  Median :0   Median : 6.50   Mode  :character
##  Mean   :0   Mean   : 8.75
##  3rd Qu.:0   3rd Qu.: 9.25
##  Max.   :0   Max.   :30.00
mean(beavers.subset$temp)
## [1] 37.0025
mean(beavers.subset$time)
## [1] 1060.833
median(beavers.subset$temp)
## [1] 37.01
median(beavers.subset$time)
## [1] 1025
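In the output above, the subset's mean time (1060.833) and median temperature (37.01) are both lower than the full-data values (1446.2 and 37.735), presumably because the subset keeps only early rows. A side-by-side table makes such a comparison easier to read. The sketch below uses a hypothetical helper, `compare_stats()`, and a six-row stand-in data frame, so its numbers are not those of the real data set.

```r
# Hypothetical helper: mean and median of chosen columns for two data frames
compare_stats <- function(full, sub, cols) {
  data.frame(
    attribute     = cols,
    mean.full     = unname(sapply(full[cols], mean)),
    mean.subset   = unname(sapply(sub[cols], mean)),
    median.full   = unname(sapply(full[cols], median)),
    median.subset = unname(sapply(sub[cols], median))
  )
}

# Stand-in data (six rows), not the real 100-row data set
toy <- data.frame(
  time = c(930, 940, 950, 1000, 1010, 1020),
  temp = c(36.58, 36.73, 36.93, 37.15, 37.23, 37.24)
)
toy.subset <- toy[1:3, ]

result <- compare_stats(toy, toy.subset, c("temp", "time"))
print(result)
```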
- For at least 3 values in a column, rename them so that every occurrence of each value in that column is renamed. For example, suppose I have 20 values of the letter “e” in one column. Rename those values so that all 20 would show as “excellent”.
# Note: time holds numbers such as 930, so no row matches "evening" and no value is renamed
beavers.subset[beavers.subset$time == "evening", "time"] <- "evening"
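One way to turn the numeric clock times into three labels is `cut()`, which bins numbers into named intervals. This is a sketch under stated assumptions: the data frame `toy` is made up, the break points (noon and 18:00) are arbitrary, and the new column name `period` is hypothetical.

```r
# Stand-in clock times in hhmm form (made-up values)
toy <- data.frame(time = c(930, 1000, 1330, 1545, 1900, 2350))

# cut() bins numeric values; the break points here are an assumption
toy$period <- cut(
  toy$time,
  breaks = c(-Inf, 1159, 1759, Inf),
  labels = c("morning", "afternoon", "evening")
)

# Convert the factor to plain character labels
toy$period <- as.character(toy$period)

print(toy)
```

Writing the labels to a new column leaves the numeric `time` values intact, so they remain available for `mean()` and `median()`.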
- Display enough rows to see examples of all of steps 1-5 above.
head(beavers.subset)
##   X day time  temp activ X.1   SL.type
## 1 1 307  930 36.58     0   1  Low Temp
## 2 2 307  940 36.73     0   2  Low Temp
## 3 3 307  950 36.93     0   3  Low Temp
## 4 4 307 1000 37.15     0   4 High Temp
## 5 5 307 1010 37.23     0   5 High Temp
## 6 6 307 1020 37.24     0   6 High Temp
- BONUS – place the original .csv file in a GitHub repository and have R read it from the link. This will be a very useful skill as you progress in your data science education and career.
# url<- "https://raw.githubusercontent.com/Sizzlo/Datasets/master/csv/boot/beaver.csv"
# beavers1<- read.csv(url)
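Reading from a link can fail if the URL is wrong or the network is down, so it may help to wrap the call. The sketch below defines a hypothetical helper, `read_csv_safely()`, and demonstrates it on a temporary local file so that it runs offline; `read.csv()` accepts a URL in the same argument position.

```r
# Hypothetical helper: read a CSV from a URL or local path, returning NULL on failure
read_csv_safely <- function(src) {
  tryCatch(
    read.csv(src, header = TRUE),
    error = function(e) {
      message("Could not read ", src, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Demonstrated on a temporary local file so the sketch runs offline;
# a raw GitHub URL can be passed in the same way
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,temp", "930,36.58", "940,36.73"), tmp)
demo <- read_csv_safely(tmp)
print(demo)
```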