For your assignment you may be using a different dataset than the one included here.
Always read the instructions on Sakai carefully.
Tasks and questions to be completed are highlighted in a larger bold font and numbered according to their section.
In a year with more rain, we may see an increase in crop production, because more water can lead to more plant growth.
This is a direct relationship: the number of fruits can potentially be predicted from the amount of rainfall in a given year.
This example represents simple linear regression, an extremely useful concept that allows us to predict the value of one variable based on another variable.
This lab will explore the concepts of simple linear regression, multiple linear regression, and Watson Analytics.
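As a rough sketch of what these two models look like on paper, the equations below use conventional notation that is an assumption here, not something defined in this lab: y is the response (for example, crop production), x is a predictor (for example, rainfall), the beta terms are the coefficients to be estimated, and epsilon is the random error.

```latex
% Simple linear regression: one predictor x
y = \beta_0 + \beta_1 x + \varepsilon

% Multiple linear regression: p predictors x_1, ..., x_p
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
```

In both cases the fitted model predicts y by plugging predictor values into the estimated coefficients.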
We are going to use tidyverse, a collection of R packages designed for data science.
Loading required package: tidyverse
-- Attaching packages --------------------------------------- tidyverse 1.2.1 --
v ggplot2 2.2.1     v purrr   0.2.4
v tibble  1.4.2     v dplyr   0.7.4
v tidyr   0.8.0     v stringr 1.2.0
v readr   1.1.1     v forcats 0.2.0
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
Loading required package: plotly
Attaching package: ‘plotly’
The following object is masked from ‘package:ggplot2’:
    last_plot
The following object is masked from ‘package:stats’:
    filter
The following object is masked from ‘package:graphics’:
    layout
Name your dataset ‘mydata’ so it is easy to work with.
Commands: read_csv() rename() head()
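# Read the advertising data (read.csv() names the unnamed first column "X")
mydata <- read.csv(file = "data/Advertising.csv")
mydata
# Pull each column out as its own vector for use in the model formulas
sales <- mydata$sales
newspaper <- mydata$newspaper
radio <- mydata$radio
tv <- mydata$TV
# Give the row-index column a descriptive name
mydata <- rename(mydata, "case_number" = "X")
head(mydata)
# Correlation matrix of every variable except the case-number column
corr <- cor(mydata[-1])
corr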
                  TV      radio  newspaper     sales
TV        1.00000000 0.05480866 0.05664787 0.7822244
radio     0.05480866 1.00000000 0.35410375 0.5762226
newspaper 0.05664787 0.35410375 1.00000000 0.2282990
sales     0.78222442 0.57622257 0.22829903 1.0000000
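The sales row of this matrix shows how strongly each advertising channel is linearly associated with sales. A minimal sketch of ranking the channels, assuming the values are simply copied by hand from the output above (the names `sales_cor`, `ranked`, and `strongest` are illustrative, not part of the lab):

```r
# Correlations with sales, copied from the sales row of the matrix above
sales_cor <- c(TV = 0.7822244, radio = 0.5762226, newspaper = 0.2282990)

# Rank predictors from strongest to weakest linear association with sales
ranked <- sort(sales_cor, decreasing = TRUE)
ranked

# Name of the single predictor most correlated with sales
strongest <- names(ranked)[1]
strongest
```

With `corr` still in the session, the same ranking could be read from `corr["sales", ]` after dropping the `sales` entry itself.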
#Simple Linear Regression Model
reg <- lm( sales ~ radio )
reg
Call:
lm(formula = sales ~ radio)
Coefficients:
(Intercept)        radio
     9.3116       0.2025
#Summary of Simple Linear Regression Model
summary(reg)
Call:
lm(formula = sales ~ radio)
Residuals:
     Min       1Q   Median       3Q      Max
-15.7305  -2.1324   0.7707   2.7775   8.1810
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  9.31164    0.56290  16.542   <2e-16 ***
radio        0.20250    0.02041   9.921   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.275 on 198 degrees of freedom
Multiple R-squared:  0.332,  Adjusted R-squared:  0.3287
F-statistic: 98.42 on 1 and 198 DF,  p-value: < 2.2e-16
# The R-squared value is 0.332 and the adjusted R-squared value is 0.3287, so radio spending alone explains only about 33% of the variance in sales. A value this low usually indicates a poor fit, and that is the case here.
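In a regression with a single predictor, R-squared equals the squared correlation between the predictor and the response, so the value can be cross-checked against the correlation matrix. The sketch below assumes the radio and sales correlation printed earlier and infers n = 200 from the 198 residual degrees of freedom; the variable names are illustrative.

```r
r <- 0.5762226   # correlation between radio and sales, from the matrix
n <- 200         # observations: 198 residual df + 2 estimated coefficients
p <- 1           # number of predictors in the model

# R-squared for a one-predictor model is the squared correlation
r_squared <- r^2

# Adjusted R-squared penalizes for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

round(c(r_squared, adj_r_squared), 4)
```

Both values agree with the summary output after rounding (0.332 and 0.3287).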
#p <- qplot( x = INDEPENDENT_VARIABLE, y = DEPENDENT_VARIABLE, data = mydata) + geom_point()
p <- qplot( x = radio, y = sales, data = mydata) + geom_point()
#Add a trend line to the plot using a linear model
#p + geom_smooth(method = "lm", formula = y ~ x)
p + geom_smooth(method = "lm", formula = y ~ x)
mlr2 <- lm( sales ~ radio + tv + newspaper)
summary(mlr2)
Call:
lm(formula = sales ~ radio + tv + newspaper)
Residuals:
    Min      1Q  Median      3Q     Max
-8.8277 -0.8908  0.2418  1.1893  2.8292
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.938889   0.311908   9.422   <2e-16 ***
radio        0.188530   0.008611  21.893   <2e-16 ***
tv           0.045765   0.001395  32.809   <2e-16 ***
newspaper   -0.001037   0.005871  -0.177     0.86
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.686 on 196 degrees of freedom
Multiple R-squared:  0.8972,  Adjusted R-squared:  0.8956
F-statistic: 570.3 on 3 and 196 DF,  p-value: < 2.2e-16
MODEL 1
sales_predicted1 = 9.31164 + (0.20250) * (69)
sales_predicted1
[1] 23.28414
MODEL 2
sales_predicted2 = 2.92110 + (0.18799) * (69) + (0.04575) * (255)
sales_predicted2
[1] 27.55866
MODEL 3
sales_predicted3 = 2.938889 + (0.188530) * (69) + (0.045765) * (255) + (-0.001037) * (75)
sales_predicted3
[1] 27.53976
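Typing coefficients by hand makes it easy to transpose a digit. One way to reduce that risk is to keep the coefficients and budgets in named vectors and match them by name. This is only a sketch: the values are copied from the three-predictor summary output and the Model 3 inputs, and the names `coefs`, `budget`, and `sales_predicted` are illustrative.

```r
# Coefficients copied from the three-predictor model summary
coefs <- c(intercept = 2.938889, radio = 0.188530,
           tv = 0.045765, newspaper = -0.001037)

# Advertising budgets used in the Model 3 prediction
budget <- c(radio = 69, tv = 255, newspaper = 75)

# Match each slope to its budget by name, then add the intercept
sales_predicted <- unname(coefs["intercept"] + sum(coefs[names(budget)] * budget))
sales_predicted
```

This reproduces the Model 3 value of 27.53976. If the fitted object is still in the session, `predict(mlr2, newdata = data.frame(radio = 69, tv = 255, newspaper = 75))` should agree up to rounding of the printed coefficients.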
To complete the last task, follow the directions found below. Make sure to screenshot and attach any pictures of the results obtained or any questions asked.
knitr::include_graphics("imgs/sales1.png")
knitr::include_graphics("imgs/sales2.png")
knitr::include_graphics("imgs/sales3.png")