Loading Libraries
library(UsingR)
1.10
The three variables are Tree, age, and circumference.
colnames(Orange)
## [1] "Tree" "age" "circumference"
1.11
The average age of the trees in the Orange dataset is 922.1429 days
(the dataset records age in days since 1968-12-31, not in years).
mean(Orange$age)
## [1] 922.1429
1.12
The largest trunk circumference among the trees is 214 mm.
max(Orange$circumference)
## [1] 214
2.4
rep("a", times=5) #Sequence 1
## [1] "a" "a" "a" "a" "a"
seq(1, 100, by = 2) #Sequence 2
## [1] 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49
## [26] 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97 99
rep(1:3, each = 3) #Sequence 3
## [1] 1 1 1 2 2 2 3 3 3
rep(1:3, times = c(3, 2, 1)) #Sequence 4
## [1] 1 1 1 2 2 3
c(1:5, 4:1) #Sequence 5
## [1] 1 2 3 4 5 4 3 2 1
2.20
The mean for the months with 31 days (166.57) is lower than the mean
for the months without 31 days (205.6).
cd <- c(79, 74, 161, 127, 133, 210, 99, 143, 249, 249, 368, 302)
names(cd) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
thirtyone <- cd[c(1, 3, 5, 7, 8, 10, 12)]
notthirtyone <- cd[c(2, 4, 6, 9, 11)]
mean(thirtyone)
## [1] 166.5714
mean(notthirtyone)
## [1] 205.6
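The same split can be made without hand-typed positions. This is a minimal sketch, not the original solution: it relies on R's built-in `month.abb` constant and on a hypothetical helper vector `days_in_month` (non-leap year) introduced only for illustration.

```r
# Monthly counts from the exercise, named with R's built-in month abbreviations
cd <- c(79, 74, 161, 127, 133, 210, 99, 143, 249, 249, 368, 302)
names(cd) <- month.abb

# Hypothetical helper (not in the original): days per month, non-leap year
days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

# Logical indexing selects each group without listing positions by hand
has31 <- days_in_month == 31
mean_31 <- mean(cd[has31])
mean_not31 <- mean(cd[!has31])
c(mean_31 = mean_31, mean_not31 = mean_not31)
```

Deriving the groups from the day counts makes it harder to misplace a month than typing index vectors manually.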
2.21
In 1995, the amount dropped from the previous year (the only negative
difference). 1991 has the biggest percentage increase, at about 56.1%.
salary <- c(0.57, 0.89, 1.08, 1.12, 1.18, 1.07, 1.17, 1.38, 1.44, 1.72)
names(salary) <- c(1990:1999)
diff(salary)
## 1991 1992 1993 1994 1995 1996 1997 1998 1999
## 0.32 0.19 0.04 0.06 -0.11 0.10 0.21 0.06 0.28
diff(salary)/salary[-length(salary)] * 100
## 1991 1992 1993 1994 1995 1996 1997 1998
## 56.140351 21.348315 3.703704 5.357143 -9.322034 9.345794 17.948718 4.347826
## 1999
## 19.444444
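Rather than reading the answers off the printed vectors, they can be extracted programmatically. A small sketch, assuming the same `salary` vector; the names `pct_change`, `dropped`, and `biggest` are illustrative and not part of the original solution.

```r
salary <- c(0.57, 0.89, 1.08, 1.12, 1.18, 1.07, 1.17, 1.38, 1.44, 1.72)
names(salary) <- 1990:1999

# Year-over-year percentage change; head(salary, -1) drops the final year
pct_change <- diff(salary) / head(salary, -1) * 100

# Years in which the salary fell relative to the previous year
dropped <- names(pct_change)[pct_change < 0]

# Year with the largest percentage increase
biggest <- names(which.max(pct_change))

dropped
biggest
```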
2.23
f <- function(x) mean(x^2) - mean(x)^2
f(1:10)
## [1] 8.25
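The function `f` computes the mean of the squares minus the square of the mean, which is the variance with divisor n (the population form). As a hedged check, the sketch below compares it with R's `var()`, which uses divisor n - 1; the names `x`, `n`, and `pop_var` are illustrative.

```r
f <- function(x) mean(x^2) - mean(x)^2

x <- 1:10
n <- length(x)

# var() divides by n - 1, so rescale it to the divide-by-n version
pop_var <- var(x) * (n - 1) / n

# TRUE when the two agree up to floating-point tolerance
all.equal(f(x), pop_var)
```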
2.42a
About 58.16% of the rivers are less than 500 miles long.
sum(rivers<500)/length(rivers)
## [1] 0.5815603
2.42b
About 66.67% of the rivers are shorter than the mean length.
sum(rivers<(mean(rivers)))/length(rivers)
## [1] 0.6666667
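Because `TRUE` counts as 1 and `FALSE` as 0, the mean of a logical vector is the proportion of `TRUE` values, so both proportions can also be written with `mean()`. A minimal sketch using the built-in `rivers` data; the variable names are illustrative.

```r
# Proportion of rivers shorter than 500 miles
prop_under_500 <- mean(rivers < 500)

# Proportion of rivers shorter than the mean length
prop_under_mean <- mean(rivers < mean(rivers))

# Express both as percentages rounded to two decimals
round(100 * c(under_500 = prop_under_500, under_mean = prop_under_mean), 2)
```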
2.42c
The 75th percentile (0.75 quantile) is 680 miles.
summary(rivers)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   135.0   310.0   425.0   591.2   680.0  3710.0
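`summary()` rounds its output for display, so the quantile can also be requested directly. A short sketch with `quantile()` and its default method; `q75` is an illustrative name.

```r
# 0.75 quantile of river lengths, using quantile()'s default method
q75 <- quantile(rivers, probs = 0.75)
q75
```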
2.47
zscore <- scale(rivers)
mean(zscore)
## [1] -5.006707e-17
sd(zscore)
## [1] 1
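With its default arguments, `scale()` subtracts the mean and divides by the standard deviation. The sketch below does the same computation by hand and checks that the two agree; `z_manual` and `z_scaled` are illustrative names.

```r
# z-scores computed by hand: center on the mean, divide by the sd
z_manual <- (rivers - mean(rivers)) / sd(rivers)

# scale() returns a one-column matrix; flatten it for comparison
z_scaled <- as.vector(scale(rivers))

# TRUE when the two agree up to floating-point tolerance
all.equal(z_manual, z_scaled)
```

The mean printed above is on the order of 1e-17 rather than exactly 0 because of floating-point rounding.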
2.47 Histogram and Boxplot
The data are skewed to the right, with a long right tail. The
distribution is unimodal because it has only one peak. The boxplot
flags outliers, all of them above roughly 1,200 miles. The most extreme
outlier is the maximum, at about 3,700 miles.
hist(rivers, freq = FALSE, xlab = "River Length (miles)", main = "River Lengths Histogram")
lines(density(rivers))

boxplot(rivers, main = "Boxplot of Rivers Dataset", col = "blue", horizontal = TRUE)
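The points that the boxplot draws as outliers can be listed numerically with `boxplot.stats()`, which applies the same 1.5 x IQR rule that `boxplot()` uses by default. A minimal sketch; `out` is an illustrative name.

```r
# Values that boxplot() would draw as individual outlier points
out <- boxplot.stats(rivers)$out

# How many outliers there are, and the range they span
length(out)
range(out)
```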

2.62
The summary function for factors returns a count of the occurrences
of each level.
summary(Cars93$Cylinders)
##      3      4      5      6      8 rotary
##      3     49      2     31      7      1
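To see the counting behavior in isolation, here is a small sketch on a made-up factor. `cyl` is hypothetical toy data, not taken from Cars93. It shows that `summary()` on a factor gives the same counts as `table()`.

```r
# Hypothetical toy factor (not from Cars93), for illustration only
cyl <- factor(c("4", "6", "4", "8", "4", "rotary"))

# Both calls count how often each level occurs
counts_summary <- summary(cyl)
counts_table <- table(cyl)

counts_summary
counts_table
```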
2.64 Bargraph
barplot(table(Cars93$Cylinders),
main = "Distribution of Cylinders in Cars93 Dataset",
xlab = "Number of Cylinders", ylab = "Count",
col = "blue", border = "black")
