The reading material for this exercise is the second part of “Introduction to R” (ch 7-12), found from the R home page or https://cran.r-project.org/doc/manuals/r-release/R-intro.html
Upload your answer to the LearnWeb assignment as a zip file containing two files: (i) an R Markdown file and (ii) the HTML file produced by knitting it. (If this is problematic, use PDF or another format.)
Create a CSV file, read it, and assign it to an object called test
# Load the library for reading files
library(readr)
# Initial file path. The full path must be given when the file is not in the working directory
#test <- read_csv("Master/Europa/Clases Alemania/R/Scripts/programmingds/excersise 2/test.csv")
# Case where the uncompressed file is in the working directory
test <- read_csv("test.csv")
## Parsed with column specification:
## cols(
## width = col_double(),
## lenght = col_double()
## )
test
## # A tibble: 4 x 2
##   width lenght
##   <dbl>  <dbl>
## 1    10     15
## 2    12     27
## 3    11     20
## 4    13     22
# To create the CSV from R instead, build a data.frame and write it out:
csvdata <- data.frame(width = c(10, 12, 11, 13), lenght = c(15, 27, 20, 22))
write.csv(csvdata, 'testData.csv')
# CSV files now in the working directory
list.files(getwd(), pattern = "\\.csv$")
## [1] "test.csv" "testData.csv"
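Note that write.csv stores the row names as an extra first column unless row.names = FALSE is given, which matters when the file is read back. A minimal sketch of a round trip, assuming base R only and a temporary file; the names csv_path and roundtrip are illustrative and not part of the exercise:

```r
# Illustrative round trip; csv_path and roundtrip are hypothetical names
csvdata <- data.frame(width = c(10, 12, 11, 13), lenght = c(15, 27, 20, 22))

# Write to a temporary file so the working directory is left untouched
csv_path <- tempfile(fileext = ".csv")
write.csv(csvdata, csv_path, row.names = FALSE)  # omit the row-name column

# Read the file back and inspect the column names and dimensions
roundtrip <- read.csv(csv_path)
names(roundtrip)
dim(roundtrip)
```

With row.names = FALSE the file should contain only the two data columns, so the object read back has the same shape as the original data frame.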
Create the same object using the data.frame function
test<-data.frame(width=c(10,12,11,13),lenght=c(15,27,20,22))
test
##   width lenght
## 1    10     15
## 2    12     27
## 3    11     20
## 4    13     22
Carry out a linear regression of width (y) against length (x), and interpret the results
linear_regression <- lm(width ~ lenght, data = test)
summary(linear_regression)
##
## Call:
## lm(formula = width ~ lenght, data = test)
##
## Residuals:
##       1       2       3       4
## -0.3649 -0.6351 -0.3108  1.3108
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   7.5270     2.7019   2.786    0.108
## lenght        0.1892     0.1260   1.501    0.272
##
## Residual standard error: 1.084 on 2 degrees of freedom
## Multiple R-squared:  0.5297, Adjusted R-squared:  0.2946
## F-statistic: 2.253 on 1 and 2 DF,  p-value: 0.2722
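The estimated slope is 0.189, so width increases by about 0.19 units per unit of lenght. However, with a standard error of 0.126 the t value is only 1.50 and the p-value is 0.272, which is far from the usual 0.05 threshold and not close to zero. The model explains about 53% of the variance in width (multiple R-squared 0.5297), and the adjusted R-squared is only 0.29. With four observations and 2 residual degrees of freedom, the data give no statistically significant evidence of a linear relationship between width and lenght, so the fit should be read as descriptive at most.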
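The quantities used in this interpretation can also be pulled out of the fitted model directly instead of being read off the printed summary. A small sketch under the same data; the object names fit, slope_p, r2, adj_r2 and slope_ci are illustrative only:

```r
# Same data as in the exercise; fit is an illustrative name
test <- data.frame(width = c(10, 12, 11, 13), lenght = c(15, 27, 20, 22))
fit <- lm(width ~ lenght, data = test)

# Coefficient table: estimate, standard error, t value, p-value
coef(summary(fit))

# p-value of the slope and the two R-squared measures
slope_p <- coef(summary(fit))["lenght", "Pr(>|t|)"]
r2      <- summary(fit)$r.squared
adj_r2  <- summary(fit)$adj.r.squared

# 95% confidence interval for the slope
slope_ci <- confint(fit, "lenght", level = 0.95)
slope_ci
```

If the confidence interval for the slope contains zero, that agrees with a p-value above 0.05 for the same coefficient.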
Add a factor variable f to test, with values (‘a’, ‘a’, ‘b’, ‘b’), run a linear regression of width (y) against length and f (x), and interpret the results.
test$f<-as.factor(c("a","a","b","b"))
factor_linear_regression <- lm(width ~ lenght+f, data = test)
summary(factor_linear_regression)
##
## Call:
## lm(formula = width ~ lenght + f, data = test)
##
## Residuals:
##       1       2       3       4
##  0.1351 -0.1351 -0.8108  0.8108
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   7.0270     2.9545   2.378    0.253
## lenght        0.1892     0.1351   1.400    0.395
## fb            1.0000     1.1625   0.860    0.548
##
## Residual standard error: 1.162 on 1 degrees of freedom
## Multiple R-squared:  0.7297, Adjusted R-squared:  0.1892
## F-statistic: 1.35 on 2 and 1 DF,  p-value: 0.5199
Adding f reduces the largest absolute residual from 1.31 to 0.81 and raises the multiple R-squared from 0.53 to 0.73, but the adjusted R-squared falls from 0.29 to 0.19 and the residual standard error rises from 1.084 to 1.162. The slope for lenght is unchanged at 0.1892, while its standard error grows by 0.0091 (from 0.1260 to 0.1351). Neither lenght (p = 0.395) nor the factor level b (p = 0.548) is significant, and the p-value of the overall F-test is 0.52, higher than before and far from zero. With only 1 residual degree of freedom the model is close to saturated, so adding this variable did not help to explain width any better.
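One way to check whether the extra variable improves the fit is a partial F-test between the two nested models. The sketch below uses anova() from base R; the object names fit_small, fit_large, cmp and adj are illustrative and not part of the original answer:

```r
# Same data as in the exercise
test <- data.frame(width = c(10, 12, 11, 13), lenght = c(15, 27, 20, 22))
test$f <- factor(c("a", "a", "b", "b"))

# Nested models: without and with the factor f
fit_small <- lm(width ~ lenght, data = test)
fit_large <- lm(width ~ lenght + f, data = test)

# Partial F-test: does adding f reduce the residual sum of squares significantly?
cmp <- anova(fit_small, fit_large)
cmp

# Adjusted R-squared penalises the extra parameter
adj <- c(small = summary(fit_small)$adj.r.squared,
         large = summary(fit_large)$adj.r.squared)
adj
```

Because only one term is added, the p-value of this F-test should match the p-value of the fb coefficient in the summary of the larger model.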
Create a data vector with 500 random values, drawn from a t distribution with 5 degrees of freedom (and zero non-centrality).
randomStud<-rt(n=500,df=5,ncp = 0)
randomStud
## [1] -0.446172002 -0.832210972 -0.238654115 -0.292400782 -0.719907338
## [6] 1.187360665 0.520285190 1.343289760 -0.334257090 0.693106224
## [11] -0.801352496 -0.542063232 -0.206985988 0.142618355 -0.604040100
## [16] -0.541663652 -0.622847411 -0.055890752 2.546330587 4.942763839
## [21] -2.249869146 -1.333324102 -0.380061995 -0.682321032 -0.295052805
## [26] -2.155713504 1.386036338 1.728168033 1.073590779 0.463805434
## [31] 1.263926973 -0.227280444 -0.854069549 0.315663041 -0.757626302
## [36] -0.325817109 0.582423723 0.078960559 -3.243375822 0.755811895
## [41] 0.534601740 -1.201254015 0.178204936 -0.290225784 -0.664003209
## [46] 0.276976785 0.459618395 1.185225413 0.018805389 -1.371068303
## [51] -2.057924273 -0.062644999 0.067235569 0.624042604 -0.090323996
## [56] 0.958399585 -0.993787438 0.139462877 -0.269449896 -0.010088366
## [61] -1.229149850 1.867872721 0.628114350 -0.853746643 -1.127145929
## [66] -1.326249977 0.320579607 1.438787791 2.142303367 1.568732710
## [71] -0.848702773 -0.341406605 0.818295923 1.042887067 1.818370790
## [76] -0.081909554 0.590649719 -0.746488828 -1.091950307 1.743147615
## [81] 2.624133994 0.033992280 1.393758071 -0.858769229 0.716437004
## [86] -2.776190993 0.379130497 -1.599811578 -1.910738442 1.039240564
## [91] 0.561231673 0.007357328 -0.614448183 1.868040785 0.033635562
## [96] -0.089799201 0.262218311 0.314570320 1.309489290 -0.136052824
## [101] -0.433604758 -0.704576398 2.338502408 2.873012817 -1.176821643
## [106] -0.648714336 0.586165358 -2.064487908 0.238346988 0.315185587
## [111] -2.222120707 0.751105435 0.561058287 0.609516650 -0.583004450
## [116] -0.011825802 1.054110462 -0.489389832 0.230115210 0.519631526
## [121] -0.364565061 -2.616856326 0.868762160 -0.860783226 1.286789951
## [126] -1.694189439 0.139271212 -1.324376007 -2.414430612 -0.050750655
## [131] -0.235572900 -0.237226567 0.362271387 -0.723771511 1.058237307
## [136] -0.181233609 3.455738732 1.050583893 -0.236630502 -0.653758583
## [141] 0.223406916 1.701031104 0.393149359 -0.430337696 1.022541183
## [146] 0.589418211 -0.091658572 2.180441407 0.866536899 -0.003608257
## [151] -0.028108436 1.773310270 0.395934997 1.139808468 -0.123943369
## [156] -0.107629274 -0.673460484 1.591828351 0.062713702 -2.606405904
## [161] -2.501744681 0.006553414 -0.421764301 1.131613455 0.144666808
## [166] 1.584472037 -0.284259837 -2.715481535 0.942901853 -2.459147114
## [171] -1.322394879 0.833426283 3.175881800 -1.670131906 0.716970379
## [176] -1.555806121 0.432716594 -1.152619923 -1.495940430 -0.320329979
## [181] 1.450514607 -0.994596734 0.601665374 1.573924012 0.499848519
## [186] 1.902155613 1.467414858 0.112651688 -0.246050814 0.278607979
## [191] 0.447183384 -0.656594225 2.117351611 1.377412440 -0.797128802
## [196] -4.291161152 -0.234477588 -1.124706702 1.250651802 -1.783934869
## [201] 0.559229016 0.229437892 -0.493251551 -1.496740355 -1.904135112
## [206] -1.268384457 -0.768158359 -0.576066648 -0.411019370 3.220598668
## [211] -0.108578721 -1.273582049 0.026044778 -0.488769858 -0.517051434
## [216] 0.140754000 -0.254733355 0.727353229 -1.617866563 -2.056315215
## [221] -1.355351511 -0.713849010 0.215673126 -0.105072507 -0.484334544
## [226] 9.107029107 2.821689406 -0.650340576 -1.826636004 1.919023710
## [231] -0.487145626 -0.606778308 -1.213119484 -0.172025565 -1.392872834
## [236] 0.578449679 -1.513050690 -1.074107795 -2.357000007 -0.515401667
## [241] 0.423064553 -0.190536474 0.461946046 -0.464990612 -0.543028331
## [246] -0.280983392 0.073895663 2.392471252 1.531697922 0.763851522
## [251] 0.603991492 -0.570769410 0.482812224 -1.522113728 0.948941796
## [256] 1.558674923 1.125844559 0.028210333 0.357148702 0.734183765
## [261] 0.146449666 -0.583983601 0.499835878 1.202999083 0.414270820
## [266] -3.767118279 0.443255815 -1.218992196 -0.996794878 -0.125854580
## [271] -0.461821225 1.374935072 -1.575101695 0.497955064 -0.180498728
## [276] 1.246712177 -0.476019722 -0.281880613 -0.245818743 0.534429349
## [281] 0.318144251 0.032612104 1.334123350 -1.145008096 0.228157550
## [286] -1.589315121 0.141839689 0.986879259 -0.774942542 0.581342246
## [291] -1.395051157 0.686168281 0.226932714 -0.617877768 -0.881337537
## [296] -1.102403435 -0.109641861 0.039100261 0.089943698 -1.026527272
## [301] 1.266480030 -0.796983049 -0.785194175 0.461483885 0.570228374
## [306] 0.629598106 -0.331670251 0.762503738 -0.526901934 0.245221837
## [311] -0.051684301 -0.329556526 2.397765329 -0.180110921 0.234102135
## [316] 0.037683219 -2.679398473 1.913611475 -1.195571274 0.491345165
## [321] 0.880082253 1.781834167 -0.940602581 1.819950524 0.904469237
## [326] -0.239908610 -0.576104650 2.359280258 -0.285350557 1.026363647
## [331] -1.535996656 -2.243729793 -1.352817613 -2.915071132 -1.709290797
## [336] 1.948671859 0.206919579 -0.337449033 0.041417652 0.668925524
## [341] -1.914662695 -0.244244210 1.003245079 1.479054794 -1.880759664
## [346] -0.544580254 0.507815351 0.883176998 1.363575350 -0.730886283
## [351] -0.765174806 -1.101567866 0.485372786 -0.240013673 -0.819805769
## [356] 0.392639322 -0.416765734 -2.048772641 -0.318526949 -2.142089006
## [361] -0.020854926 0.873376812 0.772790529 0.348077029 3.862625134
## [366] 1.257221439 -0.529383074 -1.063419402 0.515437456 1.782805630
## [371] 0.430500860 0.262637489 0.522890862 0.317522209 -1.232202657
## [376] -1.066384062 -1.093208030 -1.498624093 -1.285717067 -1.276561981
## [381] -1.523046489 1.731054167 -0.109305331 -0.948868184 0.210051203
## [386] 2.049141701 0.900251530 -0.011297709 0.505753378 -0.540601854
## [391] 0.913397610 -0.676174751 2.312509132 -0.516089532 -0.129473809
## [396] 2.101921853 0.848721682 -0.535413233 -0.613332772 -0.813932318
## [401] -0.202212908 -0.748298268 0.966398979 0.034742348 -1.988654020
## [406] 1.555412077 0.536129115 -0.270680932 0.864024004 0.537519938
## [411] -0.611890775 -2.070970436 -0.974395604 0.590979897 -0.206309582
## [416] -1.035281543 2.000533309 1.462051895 0.331907660 -1.369019076
## [421] 1.165907902 0.578608975 0.383445094 -0.107601100 -0.444863683
## [426] 0.008104674 -0.155069271 0.986496294 -1.730717194 -0.834055855
## [431] -0.555601758 0.982370611 -0.922185381 -0.061381950 0.860387971
## [436] -0.741851297 -1.145583733 -0.112360599 0.787228394 -1.000826894
## [441] -1.919501661 0.387285738 -0.582095048 -0.751372140 -1.031280134
## [446] 0.486658943 -1.087675728 -0.416604216 1.053052814 2.028932718
## [451] -2.393096438 -0.915238848 -0.005926656 -0.442497509 -1.193386922
## [456] -0.189464164 0.513850989 -2.126437726 0.208915556 -1.204862814
## [461] 0.336688235 -0.753272436 -1.335921372 -0.041504864 -0.826599356
## [466] 0.387558018 3.965579532 -2.292198012 -0.778931656 -0.337316483
## [471] 1.811833934 -1.271809082 -1.656157172 -0.769990991 0.981296732
## [476] 0.585208526 -1.875711528 -0.407723074 -3.054343672 0.512383545
## [481] 2.457582470 0.725808069 -1.428239861 1.845291575 0.559408469
## [486] -0.601037120 -1.392051061 0.012198394 0.771992244 -0.413890110
## [491] -0.970824744 0.375396511 -0.758025163 0.145685368 1.170549211
## [496] 2.063390659 0.947148577 0.168465815 0.389380122 -0.952236465
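Since rt draws new values on every run, the numbers printed above will differ between knits. A minimal sketch of how the draw could be made reproducible and summarised compactly; the seed value 42 and the names x1 and x2 are arbitrary choices for illustration:

```r
# Fixing the seed makes the random draw repeatable (42 is an arbitrary choice)
set.seed(42)
x1 <- rt(n = 500, df = 5)

set.seed(42)
x2 <- rt(n = 500, df = 5)

# The two draws are the same because the seed was reset
identical(x1, x2)

# Compact summaries instead of printing all 500 values
length(x1)
summary(x1)
head(x1, 5)
```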
Plot this vector with qqnorm, add the reference line with qqline and interpret the resulting distribution in the context of the normal distribution.
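qqnorm(randomStud, main = paste("Normal QQ Plot for random Student's t sample with", 5, "df & n =", 500), col = "#3399CC", pch = 20)
qqline(randomStud, col = "red")
In the centre of the distribution the points lie close to the reference line, so the bulk of the sample looks approximately normal. At both ends the points bend away from the line (below it on the left, above it on the right), which means the sample contains more extreme values than a normal distribution would produce. This is the heavy-tailed behaviour expected of a t distribution with 5 degrees of freedom.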
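The heavier tails can also be seen without any random sample by comparing theoretical quantiles. The sketch below compares standard normal quantiles with those of a t distribution with 5 degrees of freedom at a few upper-tail probabilities; the probabilities and the name tail_cmp are chosen only for illustration, and this compares the theoretical distributions rather than the sample plotted above:

```r
# Upper-tail probabilities at which to compare the two distributions
p <- c(0.90, 0.975, 0.995, 0.9995)

# Theoretical quantiles: standard normal versus t with 5 degrees of freedom
tail_cmp <- data.frame(p = p, normal = qnorm(p), t5 = qt(p, df = 5))

# Ratio of the t quantile to the normal quantile at each probability
tail_cmp$ratio <- tail_cmp$t5 / tail_cmp$normal
tail_cmp
```

The t quantiles exceed the normal ones at every probability, and the ratio grows further out in the tail, which is the pattern that shows up as curvature at the ends of the QQ plot.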
Write a function called plot_t that combines these steps: it should accept two parameters, n (with default value 500) and df (with default value 5), and create the qqnorm plot with the added line. Demonstrate the function with default parameter values, and with parameter values 1000 and 10.
# Function definition
plot_t <- function(n = 500, df = 5) {
  # Draw a fresh sample inside the function so that n and df take effect
  x <- rt(n, df, ncp = 0)
  qqnorm(x, main = paste("Normal QQ Plot for random Student's t sample with", df, "df & n =", n), col = "orange", pch = 15)
  # qqline() always draws on the current plot and has no add argument
  qqline(x, col = "red")
}
# Function demonstration: default values
plot_t()
# Function demonstration: values 1000 and 10
plot_t(1000, 10)
Run this function, while trying to capture the output in a png file with size 800 pixels x 800 pixels.
# Create the output file
timeshoot <- format(Sys.time(), "%y-%m-%d_%H.%M")
png(paste("vsosa_fplot_", timeshoot, ".png", sep = ""), width = 800, height = 800)
# Execute the function to draw the plot
plot_t(1000, 10)
# Close the output file
dev.off()
## png
## 2
# Plots generated:
list.files(getwd(), pattern = "\\.png$")
## [1] "Rplot.png" "Rplot01.png"
## [3] "vsosa_fplot_19-05-17_07.47.png"
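If plotting fails between png() and dev.off(), the graphics device stays open and the file may be left incomplete. A possible variant, sketched here with a hypothetical helper save_qq_png that is not part of the exercise, closes the device through on.exit and writes to a temporary file; it assumes a PNG-capable graphics device is available in the R session:

```r
# Hypothetical helper, not part of the exercise: draw a t-sample QQ plot into a PNG
save_qq_png <- function(path, n = 500, df = 5, size = 800) {
  png(path, width = size, height = size)  # width and height are in pixels by default
  on.exit(dev.off(), add = TRUE)          # close the device even if plotting fails
  x <- rt(n, df)
  qqnorm(x, main = paste0("Normal QQ plot: t sample, df = ", df, ", n = ", n))
  qqline(x, col = "red")
  invisible(path)
}

# Usage: write to a temporary file and confirm that it was created
out <- save_qq_png(tempfile(fileext = ".png"), n = 1000, df = 10)
file.exists(out)
file.info(out)$size > 0
```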