Question 1
MAT1 = matrix(c(1,2,3,2,5,6,3,6,9),nrow = 3, ncol = 3, byrow = FALSE)
MAT2 = matrix(c(1,0,0,0,1,0,0,0,1),nrow = 3, ncol = 3, byrow = FALSE)
MAT1 + t(MAT1)
MAT1 - t(MAT1)
MAT1 * MAT2
(t(MAT1) + MAT2)*(MAT1 + MAT2)^-1
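A note on the last two expressions: in R, `*` multiplies matrices entry by entry and `^-1` takes the reciprocal of each entry. If the question was asking for the matrix product and the matrix inverse (an assumption, since the question text is not reproduced here), the usual tools are `%*%` and `solve()`. A minimal sketch of the difference, using the same two matrices; the names `elementwise`, `product`, `S`, `recip`, `inverse` and `full` are only for illustration:

```r
MAT1 <- matrix(c(1,2,3,2,5,6,3,6,9), nrow = 3, ncol = 3, byrow = FALSE)
MAT2 <- diag(3)                  # 3x3 identity, same entries as MAT2 above

elementwise <- MAT1 * MAT2       # entry-by-entry product: keeps only the diagonal of MAT1
product     <- MAT1 %*% MAT2     # matrix product: equals MAT1 because MAT2 is the identity

S       <- MAT1 + MAT2
recip   <- S^-1                  # reciprocal of every entry of S
inverse <- solve(S)              # matrix inverse of S

# Matrix version of the last expression in Question 1
full <- (t(MAT1) + MAT2) %*% solve(MAT1 + MAT2)
```

Because MAT1 is symmetric, `t(MAT1) + MAT2` is the same matrix as `MAT1 + MAT2`, so `full` comes out as the identity matrix up to rounding.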
Question 2
The first and second models are linear models. This is because neither of them has a beta in an exponent, and neither has a beta inside a logarithm, so both are linear in the betas.
Question 3
B0 = -.34
B1 = 1.04
X1 = seq(0, 0+3, by=.001)
e = exp(1)
LOG1 = log(B1)*X1
plot(B0 + B1*X1^2)
plot(B1*e^X1)
plot(B0 + e^(B1*X1))
plot(B0+e^LOG1)
Question 4
Thinking: The X matrix, unless I am misunderstanding, represents a linear model, whereas the Y vector represents the actual results. In addition, the Y vector is given transposed (as a row), so it needs to be turned back into a column to be properly represented on a graph.
Method: I will plug in a few different values for the slope and see which one of them has the smallest sum of squared errors. This is the one which I will declare the best. I will use the arbitrary values of 0, -.5, -1.0, and -1.5. In addition, I will use the value of 7 as my y-intercept.
Disclaimer: It would definitely be possible to get more granular and precise results, either with more work/time/effort or with a computer/calculator.
Actual Math:
slope is 0
(6.6 - 0(2))^2 = 43.56
(2.2 - 0(5))^2 = 4.84
(-1.1 - 0(6))^2 = 1.21
43.56 + 4.84 + 1.21 = 49.61
slope is -.5
(6.6 - .5(2))^2 = 31.36
(2.2 - .5(5))^2 = .09
(-1.1 - .5(6))^2 = 16.81
31.36 + .09 + 16.81 = 48.26
slope is -1
(6.6 - 1(2))^2 = 21.16
(2.2 - 1(5))^2 = 7.84
(-1.1 - 1(6))^2 = 50.41
21.16 + 7.84 + 50.41 = 79.41
slope is -1.5
(6.6 - 1.5(2))^2 = 12.96
(2.2 - 1.5(5))^2 = 28.09
(-1.1 - 1.5(6))^2 = 102.01
102.01 + 28.09 + 12.96 = 143.06
Conclusion: The slope, or beta value, is closest to a line of best fit when it is between 0 and -.5.
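The hand arithmetic can be checked with a few lines of R. This is only a sketch: the x values (2, 5, 6) and y values (6.6, 2.2, -1.1) are read off the hand calculation, and `sse`, `b`, `hand` and `described` are names made up for illustration. `hand` evaluates the expressions exactly as they are written above, (y - b*x)^2 with b = 0, .5, 1, 1.5 and no intercept term. `described` is one reading of the Method paragraph (intercept 7 with slopes 0, -.5, -1, -1.5), which is not what the hand arithmetic computes.

```r
x <- c(2, 5, 6)
y <- c(6.6, 2.2, -1.1)

# Sum of squared errors for the line: intercept + slope * x
sse <- function(intercept, slope) sum((y - (intercept + slope * x))^2)

b <- c(0, 0.5, 1, 1.5)

# The expressions as written by hand: y - b*x, no intercept
hand <- sapply(b, function(s) sse(0, s))
print(round(hand, 2))        # 49.61 48.26 79.41 143.06

# The model as described in the Method: intercept 7, negative slopes
described <- sapply(-b, function(s) sse(7, s))
print(round(described, 2))
```

The first vector reproduces the four hand totals. Comparing it with the second shows how much the ranking of the candidate slopes depends on whether the intercept of 7 and the negative sign are actually used in the residuals.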
Question 5
The data was collected at the Mauna Loa Observatory in Hawaii. Data collection started in March of 1958. The study consisted of observing atmospheric carbon dioxide in the area. In 1974 NOAA also started to monitor the CO2 levels in the area. The data consists of both a non-smoothed portion and a portion which is smoothed by correcting the data for seasonal cyclicality.
Question 6
url <- "ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_mlo.txt"
DFCO2 <- read.table(url,header=FALSE)
colnames(DFCO2) <- c("year","meanCO2","unc")
A:
CO2LM = lm(DFCO2$meanCO2~DFCO2$year)
summary(CO2LM)
The intercept is -2.855.
The slope is 1.614.
B:
Y2 = DFCO2$meanCO2
X2 = DFCO2$year
X3 = matrix(NA, nrow = 63, ncol = 3)
X3[,1] = 1
X3[,2] = seq(1959, 1959+62, by=1)
X3[,3] = seq(1959, 1959+62, by=1)^2
Y2MEAN = mean(DFCO2$meanCO2)
ABSDIS = sum(abs(Y2-Y2MEAN))
SQRDIS = sum((Y2 -Y2MEAN)^2)
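The matrix X3 (columns 1, year, year^2) has the shape of a design matrix for a quadratic fit, although part B never uses it. Assuming that was the intent of the question, one way to get coefficients from such a matrix is the normal equations, (X'X) beta = X'y. The sketch below uses made-up, noise-free data rather than the Mauna Loa series so that it runs without a download; `yrs`, `X`, `beta`, `fitted_vals` and the three coefficients are all hypothetical. It also measures years from 1959, because squaring raw calendar years can make X'X so badly conditioned that `solve()` refuses to invert it.

```r
# Illustrative data only: a made-up, noise-free quadratic, NOT the real CO2 record
year <- seq(1959, 1959 + 62, by = 1)
yrs  <- year - 1959                        # years since 1959 keeps the numbers small
y    <- 316 + 0.8 * yrs + 0.012 * yrs^2    # hypothetical coefficients

# Design matrix with the same three kinds of columns as X3, built on yrs
X <- cbind(1, yrs, yrs^2)

# Least-squares coefficients from the normal equations (X'X) beta = X'y
beta <- solve(crossprod(X), crossprod(X, y))

# Fitted values of the quadratic at each year
fitted_vals <- as.vector(X %*% beta)
print(round(as.vector(beta), 4))
```

With the real data in place of `y`, a call such as `lines(year, fitted_vals)` after the scatter plot would draw the quadratic fit as a second curve next to the straight line.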
C:
plot(x = DFCO2$year, y = DFCO2$meanCO2, xlab = "Year", ylab = expression("Mean annual "*CO[2]*" concentration"), main = "CO2 each year", col = "blue")
abline(CO2LM)
I was only able to figure out how to produce a single line of best fit. I did this using the linear regression that I ran for part A of this same question. I fear I may be misunderstanding the instructions of this question as a whole.
D:
y = 315.98 + 1.614(2050 - 1959)
CO2 concentration in 2050 will be 462.854 according to the linear regression.
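The arithmetic behind that number can be written as a small R function. This is just a sketch of the formula above; `predict_co2` and its argument names are made up for illustration, and the defaults are the 315.98 starting value and 1.614 slope used in the equation.

```r
# Linear extrapolation: starting value plus slope times years since the base year
predict_co2 <- function(year, base = 315.98, slope = 1.614, base_year = 1959) {
  base + slope * (year - base_year)
}

print(predict_co2(2050))   # 315.98 + 1.614 * 91 = 462.854
```

Calling `predict_co2(1959)` returns the starting value 315.98, which is a quick check that the years are being counted from the right base year.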
E:
I believe that my models are an example of the use of probabilistic thinking. This is because the intercept and slope for my model were derived using linear regression. This use of linear regression helps to incorporate error terms and the possibility of "fuzziness" into the data.
Question 7
Model: y = 315.98 + 1.614(2050-1959)
Explanation: 315.98 was the starting CO2 concentration in 1959. The linear regression estimated that the CO2 concentration rose by an average of 1.614 each year. The final portion of the equation gives the number of years between 1959 and the year whose CO2 concentration we are predicting.