Location <- c("A", "A", "A", "B", "B", "B", "C")
Height <- c(100,200,300,450,600,800,1000)
Distance <- c(253,337,395,451,495,534,573)
a. Create a data frame called `Galileo` with the three variables and display the contents of the dataframe.
Galileo <- data.frame(Location, Height, Distance)
Galileo
b. Compute the mean, median, variance, and IQR of the variable `Distance` in the dataframe `Galileo`.
mean(Galileo$Distance)
## [1] 434
median(Galileo$Distance)
## [1] 451
var(Galileo$Distance)
## [1] 12837
IQR(Galileo$Distance)
## [1] 148.5
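As a cross-check on the values above, the variance and IQR can be rebuilt from their definitions. This is a minimal sketch: the names `manual_var` and `manual_iqr` are illustrative and not part of the assignment, and it assumes the default quantile method (type 7), which is what `IQR()` uses unless told otherwise.

```r
# Distances copied from the Galileo data above
Distance <- c(253, 337, 395, 451, 495, 534, 573)

# Sample variance: squared deviations from the mean, divided by n - 1
manual_var <- sum((Distance - mean(Distance))^2) / (length(Distance) - 1)

# IQR: third quartile minus first quartile (default quantile type 7)
quartiles <- quantile(Distance, probs = c(0.25, 0.75))
manual_iqr <- unname(quartiles[2] - quartiles[1])

manual_var  # should agree with var(Distance)
manual_iqr  # should agree with IQR(Distance)
```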
c. Create a variable for estimated distance $\mbox{D.Hat} = 200 + .708 \mbox{ Height} - .000344 \mbox{ Height}^2$ and add it to the data frame `Galileo`. Create a new variable `LO` that takes a value of `TRUE` when the estimated distance is lower than the measured distance ($\mbox{D.Hat} < \mbox{Distance}$) and a value of `FALSE` otherwise and add it to the data frame `Galileo`. Use the variable `LO` to extract a subset of the `Galileo` dataframe *removing* the observations for which the estimated distance is lower than the measured distance. Show the contents of this dataframe.
D.Hat <- 200 + 0.708 * Galileo$Height - 0.000344 * Galileo$Height^2
Galileo <- data.frame(Galileo, D.Hat)
LO <- Galileo$D.Hat < Galileo$Distance
Galileo <- data.frame(Galileo, LO)
Galileo
Galileo[!Galileo$LO, ]  # keep only the rows where LO is FALSE
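The same steps can also be written with column assignment and `subset()`. The sketch below is one possible variant, not the required solution; `Galileo_demo` and `kept` are hypothetical names chosen so the block runs on its own without touching the `Galileo` object above.

```r
# Rebuild the data so the example is self-contained
Galileo_demo <- data.frame(
  Location = c("A", "A", "A", "B", "B", "B", "C"),
  Height   = c(100, 200, 300, 450, 600, 800, 1000),
  Distance = c(253, 337, 395, 451, 495, 534, 573)
)

# Estimated distance from the quadratic given in part (c)
Galileo_demo$D.Hat <- 200 + 0.708 * Galileo_demo$Height -
  0.000344 * Galileo_demo$Height^2

# TRUE where the estimate falls below the measured distance
Galileo_demo$LO <- Galileo_demo$D.Hat < Galileo_demo$Distance

# Drop the rows where the estimate is too low (LO is TRUE)
kept <- subset(Galileo_demo, !LO)
kept
```

Negating the logical column directly (`!LO`) avoids comparing a logical value with the character string `'FALSE'`, which only works through implicit type coercion.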
d. Plot `Distance` ($y$-axis) versus `Height` ($x$-axis) and overlay this plot with the curve of $\mbox{D.Hat} = 200 + .708 \mbox{ Height} - .000344 \mbox{ Height}^2$.
plot(Galileo$Height, Galileo$Distance)
curve(200 + 0.708 * x - 0.000344 * x^2, col = "purple", add = TRUE)
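For comparison, the overlay can be drawn by evaluating the quadratic on a grid of heights and passing the result to `lines()`, which is roughly what `curve()` does internally. This is a sketch under stated assumptions: the function name `d_hat`, the grid size of 101 points, and the axis labels are illustrative choices, not part of the assignment.

```r
Height   <- c(100, 200, 300, 450, 600, 800, 1000)
Distance <- c(253, 337, 395, 451, 495, 534, 573)

# Estimated distance as a function of height (formula from part d)
d_hat <- function(h) 200 + 0.708 * h - 0.000344 * h^2

# Evenly spaced heights spanning the observed range
grid <- seq(min(Height), max(Height), length.out = 101)

# Scatter plot of the data, then the fitted curve on top
plot(Height, Distance, xlab = "Height", ylab = "Distance")
lines(grid, d_hat(grid), col = "purple")
```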
hw4q2.csv.
library(ggplot2)
seasons <- read.csv("hw4q2.csv", header = TRUE)
a. Read in the data and give a comparison boxplot of summer and winter relative humidity. Comment on your observations.
These box plots show that winter humidity reaches both higher and lower extremes than summer humidity, but tends to be higher overall, by about 3 percentage points. Furthermore, the winter minimum is slightly above the summer median, and the winter median is about the same as the summer maximum.
ggplot(
  seasons,
  aes(x = Season, y = Humidity, color = Season)
) +
  geom_boxplot()
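Comparisons of minimum, median, and maximum read off a box plot can be checked numerically with a five-number summary per group. The sketch below uses a small made-up data frame, `seasons_demo`, as a stand-in for `hw4q2.csv`; its values and the labels `"Summer"` and `"Winter"` are assumptions for illustration only, not the homework data.

```r
# Hypothetical stand-in for the homework data (values are invented)
seasons_demo <- data.frame(
  Season   = rep(c("Summer", "Winter"), each = 5),
  Humidity = c(60, 62, 64, 66, 68, 63, 66, 67, 69, 72)
)

# One column per season; rows are min, lower hinge, median, upper hinge, max
five <- sapply(split(seasons_demo$Humidity, seasons_demo$Season), fivenum)
rownames(five) <- c("min", "lower hinge", "median", "upper hinge", "max")
five
```

Note that `fivenum()` reports the true minimum and maximum, whereas the whiskers of `geom_boxplot()` stop at the most extreme points within 1.5 IQR of the box, so the two can differ when outliers are present.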
b. Draw two normal quantile plots - one for summer humidity and another for winter humidity. Comment on your observations.
Both QQ plots show that summer and winter humidity are approximately normally distributed. The winter points are more tightly concentrated and follow the normal pattern more closely overall than the summer points.
summerqq <- seasons[1:29, c(1, 2)]  # rows 1-29: summer observations
qqnorm(summerqq$Humidity, main = "Summer Relative Humidity")
qqline(summerqq$Humidity)
winterqq <- seasons[30:55, c(1, 2)]  # rows 30-55: winter observations
qqnorm(winterqq$Humidity, main = "Winter Relative Humidity")
qqline(winterqq$Humidity)
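Selecting rows by position depends on the file being sorted by season. A sturdier option is to select rows by the value of `Season`. The sketch below shows the idea on a made-up data frame; `seasons_demo`, its humidity values, and the labels `"Summer"` and `"Winter"` are assumptions, since the actual contents of `hw4q2.csv` are not shown here.

```r
# Hypothetical stand-in for the homework data (values are invented)
seasons_demo <- data.frame(
  Season   = rep(c("Summer", "Winter"), times = c(6, 5)),
  Humidity = c(58.1, 61.4, 63.0, 64.2, 66.9, 70.3,
               62.5, 65.8, 67.1, 68.4, 73.0)
)

# Select rows by label rather than by row number
summer_demo <- seasons_demo[seasons_demo$Season == "Summer", ]
winter_demo <- seasons_demo[seasons_demo$Season == "Winter", ]

# Normal quantile plot with reference line for each season
qqnorm(summer_demo$Humidity, main = "Summer Relative Humidity (demo)")
qqline(summer_demo$Humidity)
qqnorm(winter_demo$Humidity, main = "Winter Relative Humidity (demo)")
qqline(winter_demo$Humidity)
```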
c. Calculate the variances and IQRs for summer humidity and for winter humidity. Comment on your observations.
The variances of summer and winter humidity are very similar, with the winter variance (19.56) slightly higher than the summer variance (18.37). This suggests that winter humidity varies slightly more overall than summer humidity.
The IQRs for summer and winter are also very similar, with the winter IQR (4.45) slightly lower than the summer IQR (4.9). This suggests that the middle 50% of winter humidity values is slightly less spread out than the middle 50% of summer values.
var(summerqq$Humidity)
## [1] 18.37044
var(winterqq$Humidity)
## [1] 19.56086
IQR(summerqq$Humidity)
## [1] 4.9
IQR(winterqq$Humidity)
## [1] 4.45
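Both measures of spread can also be computed for every season in one step by splitting the humidity values on the season label. This is a minimal sketch on invented data; `seasons_demo` and `spread` are hypothetical names, and the numbers are not the homework values.

```r
# Hypothetical stand-in for the homework data (values are invented)
seasons_demo <- data.frame(
  Season   = rep(c("Summer", "Winter"), each = 5),
  Humidity = c(60, 62, 64, 66, 68, 63, 66, 67, 69, 72)
)

# Variance and IQR for each season: one column per season
spread <- sapply(
  split(seasons_demo$Humidity, seasons_demo$Season),
  function(h) c(variance = var(h), IQR = IQR(h))
)
spread
```

Because the variance uses every observation while the IQR only looks at the middle half of the data, the two measures can rank the groups differently, as they do in the results above.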
Historically, reinforced concrete structures used externally bonded steel plates to add strength and support. Recently, fiber reinforced polymer (FRP) plates have been used instead of steel plates because of their superior properties. Investigators developed a method to mathematically model bond strength between a carbon FRP and a concrete substrate. For each of 15 carbon FRP–concrete samples, they reported the maximum transferable load (kN) calculated by the model (Calc) and compared this with the corresponding maximum transferable load (kN) as measured in the laboratory (Meas). The data are given here:
conc <- read.csv("hw4q3.csv", header = TRUE)
The measured and calculated loads are relatively close in value, and they move together: samples with a low calculated load also have a low measured load, and samples with a high calculated load also have a high measured load.
plot(conc$Calc, conc$Meas)
The correlation coefficient of about 0.90 confirms a strong positive linear relationship between the calculated and measured loads.
cor(conc$Calc, conc$Meas)
## [1] 0.9030048
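For reference, `cor()` returns the Pearson correlation by default, which is the covariance of the two variables divided by the product of their standard deviations. The sketch below checks that identity on made-up load values; `calc_demo` and `meas_demo` are hypothetical vectors, not the contents of `hw4q3.csv`.

```r
# Hypothetical loads in kN (invented for illustration)
calc_demo <- c(4.2, 5.1, 6.3, 7.8, 9.0)
meas_demo <- c(4.5, 4.9, 6.8, 7.4, 9.6)

# Pearson correlation from its definition
r_manual <- cov(calc_demo, meas_demo) / (sd(calc_demo) * sd(meas_demo))

r_manual                   # definition-based value
cor(calc_demo, meas_demo)  # built-in value, should agree with r_manual
```

A high correlation shows that the two loads rise and fall together, but not that they are equal; a plot with the line `Meas = Calc` added, or the differences `Meas - Calc`, would be needed to judge how closely the model reproduces the measurements.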