Soil moisture is a critical environmental variable influencing plant growth, surface energy balance, hydrology, and agricultural productivity. Hyperspectral reflectance sensors measure the reflectance of soils at extremely narrow and contiguous wavelength intervals, making them powerful tools for estimating soil properties.
This project focuses on determining whether soil moisture can be predicted using reflectance values from 454–950 nm, combined with soil temperature.
Multiple Linear Regression is used due to the continuous nature of the variables, and all model assumptions are thoroughly checked.
Hyperspectral imaging data capture reflectance at numerous wavelengths across the electromagnetic spectrum.
This makes it possible to distinguish materials, because each absorbs and reflects light differently.
Healthier soils are known to reflect more light than unhealthy soils. Their water content is the key to their distinctive reflectance properties. As noted by Jambhali et al., “Water dominates the optical reflectance properties of water bearing materials.”
Hyperspectral data often contain hundreds of highly correlated bands, which can lead to multicollinearity. However, regression models can still produce reliable predictions even when individual coefficients may be unstable.
Level of Significance: \(\alpha = 0.05\)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.1 ✔ stringr 1.5.1
## ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
library(broom)
setwd("~/Downloads/25_Semesters/Fall/DATA101")
soil_data <- read_csv("soilmoisture_dataset.csv")
## Rows: 679 Columns: 129
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (128): index, soil_moisture, soil_temperature, 454, 458, 462, 466, 470,...
## dttm (1): datetime
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Working copy
df <- soil_data
# Structure
str(df)
## spc_tbl_ [679 × 129] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ index : num [1:679] 0 1 2 3 4 5 6 7 8 9 ...
## $ datetime : POSIXct[1:679], format: "2017-05-23 14:06:17" "2017-05-23 14:08:17" ...
## $ soil_moisture : num [1:679] 33.5 33.5 33.5 33.3 33.3 ...
## $ soil_temperature: num [1:679] 34.8 35.2 35.4 35 35.3 35.5 35.4 35.1 35 34.8 ...
## $ 454 : num [1:679] 0.0821 0.0795 0.0806 0.078 0.08 ...
## $ 458 : num [1:679] 0.0559 0.0553 0.0541 0.055 0.0553 ...
## $ 462 : num [1:679] 0.05 0.0491 0.0492 0.0491 0.0493 ...
## $ 466 : num [1:679] 0.0479 0.0476 0.0475 0.0479 0.0474 ...
## $ 470 : num [1:679] 0.0475 0.0467 0.0465 0.0469 0.047 ...
## $ 474 : num [1:679] 0.0465 0.0468 0.046 0.0468 0.047 ...
## $ 478 : num [1:679] 0.0467 0.0463 0.0463 0.0468 0.0468 ...
## $ 482 : num [1:679] 0.0468 0.047 0.0469 0.047 0.0471 ...
## $ 486 : num [1:679] 0.0475 0.0477 0.0472 0.0476 0.0481 ...
## $ 490 : num [1:679] 0.0486 0.0483 0.0486 0.0485 0.0482 ...
## $ 494 : num [1:679] 0.0493 0.0491 0.0492 0.0487 0.049 ...
## $ 498 : num [1:679] 0.0503 0.0503 0.0499 0.0499 0.0501 ...
## $ 502 : num [1:679] 0.0513 0.0515 0.0511 0.0514 0.0516 ...
## $ 506 : num [1:679] 0.0532 0.0528 0.0523 0.052 0.053 ...
## $ 510 : num [1:679] 0.0543 0.0543 0.0539 0.0544 0.0544 ...
## $ 514 : num [1:679] 0.0559 0.0554 0.0553 0.056 0.0558 ...
## $ 518 : num [1:679] 0.0575 0.0572 0.0572 0.0573 0.0575 ...
## $ 522 : num [1:679] 0.0593 0.0588 0.059 0.0589 0.0593 ...
## $ 526 : num [1:679] 0.061 0.0607 0.0606 0.0609 0.0611 ...
## $ 530 : num [1:679] 0.0625 0.0621 0.0619 0.0624 0.0625 ...
## $ 534 : num [1:679] 0.0641 0.0639 0.0635 0.0643 0.0642 ...
## $ 538 : num [1:679] 0.0662 0.0656 0.066 0.0652 0.0655 ...
## $ 542 : num [1:679] 0.0678 0.0677 0.0672 0.0672 0.0674 ...
## $ 546 : num [1:679] 0.0695 0.0691 0.0691 0.0688 0.0693 ...
## $ 550 : num [1:679] 0.0713 0.0709 0.0712 0.0709 0.0714 ...
## $ 554 : num [1:679] 0.0729 0.0729 0.0727 0.0733 0.0732 ...
## $ 558 : num [1:679] 0.075 0.0746 0.0744 0.0748 0.0749 ...
## $ 562 : num [1:679] 0.0773 0.0764 0.0767 0.0769 0.077 ...
## $ 566 : num [1:679] 0.0786 0.0787 0.0783 0.0785 0.0789 ...
## $ 570 : num [1:679] 0.0808 0.0802 0.0801 0.0804 0.0808 ...
## $ 574 : num [1:679] 0.0823 0.0821 0.0824 0.082 0.0824 ...
## $ 578 : num [1:679] 0.0849 0.0839 0.0839 0.0838 0.0847 ...
## $ 582 : num [1:679] 0.0865 0.0859 0.0854 0.0858 0.0864 ...
## $ 586 : num [1:679] 0.0879 0.0874 0.0869 0.0871 0.0878 ...
## $ 590 : num [1:679] 0.0893 0.0884 0.0887 0.0886 0.0893 ...
## $ 594 : num [1:679] 0.0908 0.09 0.0903 0.0901 0.0906 ...
## $ 598 : num [1:679] 0.0919 0.0909 0.0915 0.0916 0.0918 ...
## $ 602 : num [1:679] 0.0932 0.0924 0.0927 0.0927 0.0935 ...
## $ 606 : num [1:679] 0.0944 0.0937 0.094 0.094 0.0943 ...
## $ 610 : num [1:679] 0.0955 0.0944 0.0945 0.0949 0.0955 ...
## $ 614 : num [1:679] 0.0965 0.0957 0.0955 0.0958 0.0962 ...
## $ 618 : num [1:679] 0.0973 0.0965 0.0962 0.0969 0.0971 ...
## $ 622 : num [1:679] 0.0986 0.0975 0.0972 0.0979 0.0982 ...
## $ 626 : num [1:679] 0.0997 0.0984 0.0985 0.0989 0.099 ...
## $ 630 : num [1:679] 0.1003 0.0991 0.0993 0.0998 0.0999 ...
## $ 634 : num [1:679] 0.101 0.1 0.1 0.1 0.1 ...
## $ 638 : num [1:679] 0.102 0.101 0.101 0.102 0.102 ...
## $ 642 : num [1:679] 0.103 0.102 0.102 0.103 0.103 ...
## $ 646 : num [1:679] 0.104 0.103 0.104 0.104 0.104 ...
## $ 650 : num [1:679] 0.106 0.104 0.104 0.105 0.105 ...
## $ 654 : num [1:679] 0.107 0.105 0.106 0.106 0.107 ...
## $ 658 : num [1:679] 0.108 0.107 0.107 0.107 0.108 ...
## $ 662 : num [1:679] 0.109 0.107 0.107 0.108 0.109 ...
## $ 666 : num [1:679] 0.11 0.109 0.109 0.109 0.11 ...
## $ 670 : num [1:679] 0.111 0.11 0.11 0.111 0.111 ...
## $ 674 : num [1:679] 0.112 0.112 0.111 0.112 0.112 ...
## $ 678 : num [1:679] 0.114 0.113 0.113 0.113 0.114 ...
## $ 682 : num [1:679] 0.115 0.114 0.114 0.115 0.115 ...
## $ 686 : num [1:679] 0.117 0.115 0.116 0.116 0.116 ...
## $ 690 : num [1:679] 0.118 0.117 0.116 0.117 0.117 ...
## $ 694 : num [1:679] 0.119 0.118 0.118 0.118 0.119 ...
## $ 698 : num [1:679] 0.12 0.119 0.119 0.119 0.12 ...
## $ 702 : num [1:679] 0.122 0.121 0.12 0.12 0.121 ...
## $ 706 : num [1:679] 0.123 0.122 0.121 0.122 0.123 ...
## $ 710 : num [1:679] 0.124 0.123 0.123 0.123 0.124 ...
## $ 714 : num [1:679] 0.125 0.124 0.124 0.125 0.125 ...
## $ 718 : num [1:679] 0.127 0.125 0.126 0.126 0.127 ...
## $ 722 : num [1:679] 0.128 0.127 0.127 0.127 0.128 ...
## $ 726 : num [1:679] 0.129 0.128 0.128 0.129 0.129 ...
## $ 730 : num [1:679] 0.131 0.13 0.13 0.13 0.13 ...
## $ 734 : num [1:679] 0.132 0.131 0.131 0.131 0.131 ...
## $ 738 : num [1:679] 0.133 0.131 0.132 0.132 0.133 ...
## $ 742 : num [1:679] 0.134 0.133 0.133 0.134 0.134 ...
## $ 746 : num [1:679] 0.135 0.135 0.135 0.135 0.135 ...
## $ 750 : num [1:679] 0.137 0.136 0.136 0.136 0.137 ...
## $ 754 : num [1:679] 0.138 0.137 0.137 0.137 0.138 ...
## $ 758 : num [1:679] 0.139 0.137 0.138 0.138 0.139 ...
## $ 762 : num [1:679] 0.14 0.138 0.138 0.138 0.14 ...
## $ 766 : num [1:679] 0.14 0.139 0.139 0.139 0.14 ...
## $ 770 : num [1:679] 0.141 0.14 0.14 0.14 0.141 ...
## $ 774 : num [1:679] 0.142 0.141 0.14 0.141 0.141 ...
## $ 778 : num [1:679] 0.142 0.142 0.141 0.142 0.142 ...
## $ 782 : num [1:679] 0.143 0.142 0.142 0.142 0.143 ...
## $ 786 : num [1:679] 0.144 0.142 0.143 0.143 0.144 ...
## $ 790 : num [1:679] 0.145 0.143 0.143 0.144 0.144 ...
## $ 794 : num [1:679] 0.146 0.144 0.144 0.144 0.145 ...
## $ 798 : num [1:679] 0.146 0.145 0.145 0.145 0.146 ...
## $ 802 : num [1:679] 0.146 0.145 0.145 0.145 0.146 ...
## $ 806 : num [1:679] 0.147 0.145 0.146 0.146 0.147 ...
## $ 810 : num [1:679] 0.147 0.146 0.146 0.146 0.147 ...
## $ 814 : num [1:679] 0.147 0.146 0.146 0.147 0.147 ...
## $ 818 : num [1:679] 0.148 0.146 0.146 0.147 0.147 ...
## $ 822 : num [1:679] 0.148 0.147 0.146 0.147 0.147 ...
## $ 826 : num [1:679] 0.148 0.147 0.146 0.147 0.147 ...
## $ 830 : num [1:679] 0.148 0.147 0.147 0.147 0.148 ...
## [list output truncated]
## - attr(*, "spec")=
## .. cols(
## .. index = col_double(),
## .. datetime = col_datetime(format = ""),
## .. soil_moisture = col_double(),
## .. soil_temperature = col_double(),
## .. `454` = col_double(),
## .. `458` = col_double(),
## .. `462` = col_double(),
## .. `466` = col_double(),
## .. `470` = col_double(),
## .. `474` = col_double(),
## .. `478` = col_double(),
## .. `482` = col_double(),
## .. `486` = col_double(),
## .. `490` = col_double(),
## .. `494` = col_double(),
## .. `498` = col_double(),
## .. `502` = col_double(),
## .. `506` = col_double(),
## .. `510` = col_double(),
## .. `514` = col_double(),
## .. `518` = col_double(),
## .. `522` = col_double(),
## .. `526` = col_double(),
## .. `530` = col_double(),
## .. `534` = col_double(),
## .. `538` = col_double(),
## .. `542` = col_double(),
## .. `546` = col_double(),
## .. `550` = col_double(),
## .. `554` = col_double(),
## .. `558` = col_double(),
## .. `562` = col_double(),
## .. `566` = col_double(),
## .. `570` = col_double(),
## .. `574` = col_double(),
## .. `578` = col_double(),
## .. `582` = col_double(),
## .. `586` = col_double(),
## .. `590` = col_double(),
## .. `594` = col_double(),
## .. `598` = col_double(),
## .. `602` = col_double(),
## .. `606` = col_double(),
## .. `610` = col_double(),
## .. `614` = col_double(),
## .. `618` = col_double(),
## .. `622` = col_double(),
## .. `626` = col_double(),
## .. `630` = col_double(),
## .. `634` = col_double(),
## .. `638` = col_double(),
## .. `642` = col_double(),
## .. `646` = col_double(),
## .. `650` = col_double(),
## .. `654` = col_double(),
## .. `658` = col_double(),
## .. `662` = col_double(),
## .. `666` = col_double(),
## .. `670` = col_double(),
## .. `674` = col_double(),
## .. `678` = col_double(),
## .. `682` = col_double(),
## .. `686` = col_double(),
## .. `690` = col_double(),
## .. `694` = col_double(),
## .. `698` = col_double(),
## .. `702` = col_double(),
## .. `706` = col_double(),
## .. `710` = col_double(),
## .. `714` = col_double(),
## .. `718` = col_double(),
## .. `722` = col_double(),
## .. `726` = col_double(),
## .. `730` = col_double(),
## .. `734` = col_double(),
## .. `738` = col_double(),
## .. `742` = col_double(),
## .. `746` = col_double(),
## .. `750` = col_double(),
## .. `754` = col_double(),
## .. `758` = col_double(),
## .. `762` = col_double(),
## .. `766` = col_double(),
## .. `770` = col_double(),
## .. `774` = col_double(),
## .. `778` = col_double(),
## .. `782` = col_double(),
## .. `786` = col_double(),
## .. `790` = col_double(),
## .. `794` = col_double(),
## .. `798` = col_double(),
## .. `802` = col_double(),
## .. `806` = col_double(),
## .. `810` = col_double(),
## .. `814` = col_double(),
## .. `818` = col_double(),
## .. `822` = col_double(),
## .. `826` = col_double(),
## .. `830` = col_double(),
## .. `834` = col_double(),
## .. `838` = col_double(),
## .. `842` = col_double(),
## .. `846` = col_double(),
## .. `850` = col_double(),
## .. `854` = col_double(),
## .. `858` = col_double(),
## .. `862` = col_double(),
## .. `866` = col_double(),
## .. `870` = col_double(),
## .. `874` = col_double(),
## .. `878` = col_double(),
## .. `882` = col_double(),
## .. `886` = col_double(),
## .. `890` = col_double(),
## .. `894` = col_double(),
## .. `898` = col_double(),
## .. `902` = col_double(),
## .. `906` = col_double(),
## .. `910` = col_double(),
## .. `914` = col_double(),
## .. `918` = col_double(),
## .. `922` = col_double(),
## .. `926` = col_double(),
## .. `930` = col_double(),
## .. `934` = col_double(),
## .. `938` = col_double(),
## .. `942` = col_double(),
## .. `946` = col_double(),
## .. `950` = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
# Column names
names(df)
## [1] "index" "datetime" "soil_moisture"
## [4] "soil_temperature" "454" "458"
## [7] "462" "466" "470"
## [10] "474" "478" "482"
## [13] "486" "490" "494"
## [16] "498" "502" "506"
## [19] "510" "514" "518"
## [22] "522" "526" "530"
## [25] "534" "538" "542"
## [28] "546" "550" "554"
## [31] "558" "562" "566"
## [34] "570" "574" "578"
## [37] "582" "586" "590"
## [40] "594" "598" "602"
## [43] "606" "610" "614"
## [46] "618" "622" "626"
## [49] "630" "634" "638"
## [52] "642" "646" "650"
## [55] "654" "658" "662"
## [58] "666" "670" "674"
## [61] "678" "682" "686"
## [64] "690" "694" "698"
## [67] "702" "706" "710"
## [70] "714" "718" "722"
## [73] "726" "730" "734"
## [76] "738" "742" "746"
## [79] "750" "754" "758"
## [82] "762" "766" "770"
## [85] "774" "778" "782"
## [88] "786" "790" "794"
## [91] "798" "802" "806"
## [94] "810" "814" "818"
## [97] "822" "826" "830"
## [100] "834" "838" "842"
## [103] "846" "850" "854"
## [106] "858" "862" "866"
## [109] "870" "874" "878"
## [112] "882" "886" "890"
## [115] "894" "898" "902"
## [118] "906" "910" "914"
## [121] "918" "922" "926"
## [124] "930" "934" "938"
## [127] "942" "946" "950"
# Preview rows
head(df, 5)
## # A tibble: 5 × 129
## index datetime soil_moisture soil_temperature `454` `458` `462`
## <dbl> <dttm> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 2017-05-23 14:06:17 33.5 34.8 0.0821 0.0559 0.0500
## 2 1 2017-05-23 14:08:17 33.5 35.2 0.0795 0.0553 0.0491
## 3 2 2017-05-23 14:10:17 33.5 35.4 0.0806 0.0541 0.0492
## 4 3 2017-05-23 14:12:17 33.3 35 0.0780 0.0550 0.0491
## 5 4 2017-05-23 14:14:17 33.3 35.3 0.0800 0.0553 0.0493
## # ℹ 122 more variables: `466` <dbl>, `470` <dbl>, `474` <dbl>, `478` <dbl>,
## # `482` <dbl>, `486` <dbl>, `490` <dbl>, `494` <dbl>, `498` <dbl>,
## # `502` <dbl>, `506` <dbl>, `510` <dbl>, `514` <dbl>, `518` <dbl>,
## # `522` <dbl>, `526` <dbl>, `530` <dbl>, `534` <dbl>, `538` <dbl>,
## # `542` <dbl>, `546` <dbl>, `550` <dbl>, `554` <dbl>, `558` <dbl>,
## # `562` <dbl>, `566` <dbl>, `570` <dbl>, `574` <dbl>, `578` <dbl>,
## # `582` <dbl>, `586` <dbl>, `590` <dbl>, `594` <dbl>, `598` <dbl>, …
tail(df, 5)
## # A tibble: 5 × 129
## index datetime soil_moisture soil_temperature `454` `458` `462`
## <dbl> <dttm> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 677 2017-05-26 14:00:10 30.0 40.5 0.0956 0.0633 0.0549
## 2 678 2017-05-26 14:02:10 29.8 39.5 0.0952 0.0642 0.0548
## 3 679 2017-05-26 14:04:10 29.8 39.5 0.0956 0.0645 0.0558
## 4 680 2017-05-26 14:06:10 29.9 39.5 0.0950 0.0642 0.0550
## 5 681 2017-05-26 14:08:10 29.8 39.7 0.0977 0.0654 0.0561
## # ℹ 122 more variables: `466` <dbl>, `470` <dbl>, `474` <dbl>, `478` <dbl>,
## # `482` <dbl>, `486` <dbl>, `490` <dbl>, `494` <dbl>, `498` <dbl>,
## # `502` <dbl>, `506` <dbl>, `510` <dbl>, `514` <dbl>, `518` <dbl>,
## # `522` <dbl>, `526` <dbl>, `530` <dbl>, `534` <dbl>, `538` <dbl>,
## # `542` <dbl>, `546` <dbl>, `550` <dbl>, `554` <dbl>, `558` <dbl>,
## # `562` <dbl>, `566` <dbl>, `570` <dbl>, `574` <dbl>, `578` <dbl>,
## # `582` <dbl>, `586` <dbl>, `590` <dbl>, `594` <dbl>, `598` <dbl>, …
# Choose a subset of bands
df_clean <- soil_data |>
  select(soil_moisture,
         soil_temperature,
         `454`, `550`, `650`, `750`, `850`, `950`) |>
  drop_na()
head(df_clean)
## # A tibble: 6 × 8
## soil_moisture soil_temperature `454` `550` `650` `750` `850` `950`
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 33.5 34.8 0.0821 0.0713 0.106 0.137 0.150 0.154
## 2 33.5 35.2 0.0795 0.0709 0.104 0.136 0.147 0.157
## 3 33.5 35.4 0.0806 0.0712 0.104 0.136 0.148 0.154
## 4 33.3 35 0.0780 0.0709 0.105 0.136 0.148 0.158
## 5 33.3 35.3 0.0800 0.0714 0.105 0.137 0.148 0.156
## 6 33.2 35.5 0.0815 0.0707 0.105 0.136 0.148 0.155
summary(df_clean$soil_moisture)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 25.50 28.25 31.77 31.57 34.19 42.50
summary(df_clean$soil_temperature)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 26.40 33.60 36.70 37.50 41.15 47.10
A multiple linear regression model predicts soil moisture using soil temperature and reflectance at 454, 550, 650, 750, 850, and 950 nm.
These wavelengths represent hyperspectral reflectance bands.
The hyperspectral dataset contains reflectance values from 454–950 nm at 4-nm intervals. Bands 454, 550, 650, 750, 850, and 950 nm were selected because they span distinct regions of the electromagnetic spectrum, capturing the major ways soil moisture affects reflectance. Choosing these spaced-apart wavelengths also reduces multicollinearity while preserving essential spectral information.
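The claim that spaced-apart bands reduce multicollinearity can be checked with variance inflation factors (VIFs). The sketch below is illustrative only: it uses simulated columns (the hypothetical `b1`, `b2`, `b3`), not the soil dataset, and a small helper `vif_manual()` written for this example. For the model in this report, `car::vif()` applied to the fitted `lm` object should give the equivalent numbers.

```r
# Illustrative only: simulated "bands", not the soil dataset.
set.seed(1)
n <- 200
base <- rnorm(n)                      # shared signal driving all bands
bands <- data.frame(
  b1 = base + rnorm(n, sd = 0.05),    # near-duplicate of b2 (like adjacent bands)
  b2 = base + rnorm(n, sd = 0.05),
  b3 = base + rnorm(n, sd = 1.00)     # more weakly related (like a distant band)
)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
# predictor j on all the other predictors.
vif_manual <- function(predictors) {
  sapply(names(predictors), function(v) {
    others <- setdiff(names(predictors), v)
    fit <- lm(reformulate(others, response = v), data = predictors)
    1 / (1 - summary(fit)$r.squared)
  })
}

vifs <- vif_manual(bands)
print(round(vifs, 1))
```

The near-duplicate columns get very large VIFs while the weakly related column stays close to 1. A common rule of thumb treats values above roughly 5 to 10 as a sign of problematic collinearity.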
# Fit the model
multiple_lm <- lm(soil_moisture ~ soil_temperature + `454` + `550` + `650` + `750` + `850` + `950`,
data = df_clean)
# View summary
summary(multiple_lm)
##
## Call:
## lm(formula = soil_moisture ~ soil_temperature + `454` + `550` +
## `650` + `750` + `850` + `950`, data = df_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.6085 -1.0473 -0.2183 1.0229 8.5218
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 57.45896 0.57407 100.091 < 2e-16 ***
## soil_temperature -0.41870 0.02365 -17.704 < 2e-16 ***
## `454` 62.81845 10.96500 5.729 1.53e-08 ***
## `550` 133.74798 35.41145 3.777 0.000173 ***
## `650` -133.93142 41.26045 -3.246 0.001229 **
## `750` 181.59719 48.85625 3.717 0.000218 ***
## `850` -177.82961 35.87794 -4.957 9.10e-07 ***
## `950` -50.62741 15.59957 -3.245 0.001231 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.616 on 671 degrees of freedom
## Multiple R-squared: 0.8055, Adjusted R-squared: 0.8034
## F-statistic: 396.9 on 7 and 671 DF, p-value: < 2.2e-16
When all predictors are zero, predicted soil moisture would be roughly 57%. This is an extrapolation, since zero temperature and zero reflectance lie outside the observed data.
As soil temperature increases, predicted soil moisture decreases. For each 1-unit increase in temperature, moisture drops by about 0.42 units, holding the reflectance bands constant.
Positive estimate: higher reflectance at that band increases predicted soil moisture, holding the other predictors constant.
Negative estimate: higher reflectance at that band decreases predicted soil moisture, holding the other predictors constant.
Significant p-values: every predictor has a p-value below the level of significance (0.05), indicating that each is a significant predictor in the model.
Based on the R-squared value, the model explains about 80% of the variation in soil moisture, suggesting a strong and meaningful relationship.
The predictions are typically off by about 1.6 units of soil moisture (the residual standard error). The model as a whole is highly significant (F-test p-value < 2.2e-16).
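Written out with the coefficients from the summary above (rounded to three decimals), the fitted model is as follows. The symbols \(T\) for soil temperature and \(R_{\lambda}\) for reflectance at \(\lambda\) nm are notation introduced here for convenience.

\[
\widehat{\text{moisture}} = 57.459 - 0.419\,T + 62.818\,R_{454} + 133.748\,R_{550} - 133.931\,R_{650} + 181.597\,R_{750} - 177.830\,R_{850} - 50.627\,R_{950}
\]

Because reflectance is recorded as a fraction (roughly 0.05 to 0.16 in the rows shown), a one-unit change in a band is not realistic. A more useful reading is per 0.01 of reflectance: for example, a 0.01 increase in \(R_{454}\) corresponds to a change of \(62.818 \times 0.01 \approx 0.63\) units in predicted soil moisture, with the other predictors held fixed.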
Ideal outcome: The Residuals vs Fitted plot shows random scatter around 0.
Violation: A visible pattern or curvature in the residuals.
# Linearity
plot(multiple_lm, which = 1)
Ideal outcome: The Durbin–Watson statistic is roughly 2, meaning residuals are independent.
Violation: A Durbin–Watson statistic near 0 indicates positive autocorrelation, and a statistic near 4 indicates negative autocorrelation.
# Independence of Observations
durbinWatsonTest(multiple_lm)
## lag Autocorrelation D-W Statistic p-value
## 1 0.8774503 0.2423303 0
## Alternative hypothesis: rho != 0
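To see where the statistic comes from, the Durbin–Watson value can be computed directly from residuals. The sketch below uses simulated data with AR(1) errors (the variables `x`, `y`, and `e_true` are made up for illustration) and shows the approximation DW ≈ 2(1 − r), where r is the lag-1 autocorrelation. Applied to the output above, 2 × (1 − 0.877) ≈ 0.245, which is close to the reported 0.242.

```r
# Illustrative only: simulated regression with autocorrelated errors.
set.seed(42)
n <- 300
x <- seq_len(n) / n
e_true <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))  # AR(1) errors
y <- 1 + 2 * x + e_true
fit <- lm(y ~ x)
e <- unname(resid(fit))

# Durbin-Watson statistic: squared successive differences over squared residuals
dw <- sum(diff(e)^2) / sum(e^2)

# Lag-1 autocorrelation of the residuals (same denominator as dw)
r1 <- sum(e[-1] * e[-n]) / sum(e^2)

# Approximation: dw is close to 2 * (1 - r1); the gap is an end-point term
approx_dw <- 2 * (1 - r1)
print(c(dw = dw, r1 = r1, approx = approx_dw))
```

The two values differ only by a small end-point term, so a statistic far below 2 goes hand in hand with strong positive lag-1 autocorrelation.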
Ideal outcome: Residuals have roughly equal spread across fitted values.
Violation: Funnel shape indicates heteroscedasticity.
# Homoscedasticity (constant variance)
plot(multiple_lm, which = 3)
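A visual check can be complemented by a formal test. The sketch below computes a studentized (Koenker) Breusch–Pagan statistic by hand on simulated data whose error spread grows with `x`. The data and variable names are assumptions for illustration, not part of this analysis. For the model in this report, `car::ncvTest()` on the fitted `lm` object offers a related score test.

```r
# Illustrative only: simulated data with non-constant error variance.
set.seed(7)
n <- 400
x <- runif(n, 1, 10)
y <- 3 + 2 * x + rnorm(n, sd = 0.5 * x)   # error spread grows with x
fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on the predictors
aux <- lm(resid(fit)^2 ~ x)

# Test statistic is n * R^2 of the auxiliary regression,
# compared to a chi-square with df = number of predictors
bp_stat <- n * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
print(c(statistic = bp_stat, df = bp_df, p_value = bp_p))
```

A small p-value points to heteroscedasticity, while a large one is consistent with roughly constant residual variance.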
Ideal outcome: Points follow the diagonal line, meaning residuals are approximately normal.
# Normality of Residuals
plot(multiple_lm, which = 2)
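The Q-Q plot can be paired with a Shapiro–Wilk test from base R. This is a minimal sketch on simulated data (the `x` and `y` below are made up). `shapiro.test()` accepts between 3 and 5000 values, so it would also be applicable to the 679 residuals in this analysis.

```r
# Illustrative only: simulated regression with normal errors.
set.seed(3)
n <- 300
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
fit <- lm(y ~ x)

# Null hypothesis: the residuals come from a normal distribution
sw <- shapiro.test(resid(fit))
print(sw)
```

With several hundred observations the test can flag small departures that matter little in practice, so it is best read together with the plot.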
Ideal outcome: No points with extremely high leverage or Cook’s distance.
Violation: One or more points with extremely high Cook’s distance.
# Residuals vs Leverage
plot(multiple_lm, which = 5)
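Influential points can also be listed numerically. The sketch below plants one outlying observation in simulated data and flags points whose Cook's distance exceeds 4/n. That cutoff is a common rule of thumb rather than a formal test, and the data here are an assumption for illustration.

```r
# Illustrative only: simulated data with one planted influential point.
set.seed(11)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
x[n] <- 8
y[n] <- -10                      # far from the bulk in x and off the trend in y
fit <- lm(y ~ x)

cd <- cooks.distance(fit)        # influence of each observation on the fit
lev <- hatvalues(fit)            # leverage of each observation

flagged <- which(cd > 4 / n)     # rule-of-thumb cutoff
print(data.frame(obs = flagged,
                 cooks_d = round(cd[flagged], 3),
                 leverage = round(lev[flagged], 3)))
```

Refitting the model without the flagged rows and comparing coefficients is a simple way to judge whether they change the conclusions.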
residuals_model <- resid(multiple_lm)
rmse <- sqrt(mean(residuals_model^2))
rmse
## [1] 1.606691
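This RMSE is slightly smaller than the residual standard error reported by `summary()` (1.616) because the two divide the residual sum of squares by different quantities: RMSE divides by n, while the residual standard error divides by the residual degrees of freedom. Using the numbers above, 1.616 × sqrt(671 / 679) ≈ 1.606, which matches the RMSE up to rounding. The sketch below checks the relationship on simulated data. The variables are made up for illustration.

```r
# Illustrative only: simulated two-predictor regression.
set.seed(5)
n <- 120
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

rss <- sum(resid(fit)^2)                 # residual sum of squares
rmse <- sqrt(rss / n)                    # divides by n
rse <- sqrt(rss / df.residual(fit))      # divides by n - p; equals sigma(fit)

print(c(rmse = rmse, rse = rse, sigma = sigma(fit)))
```

Both are in-sample measures, so neither says how well the model would predict new observations.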
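Reflectance at some wavelengths has a positive coefficient, so predicted moisture rises with it (454, 550, 750 nm).
Reflectance at others has a negative coefficient, so predicted moisture falls with it (650, 850, 950 nm).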
Overall, the regression model fits well and meets most key assumptions. The residuals vs fitted plot shows residuals scattered mostly at random across fitted values, with only slight non-linearity at the lower end. The Durbin–Watson test (statistic of about 0.24, lag-1 autocorrelation of about 0.88) reveals strong positive autocorrelation in the residuals, so the independence assumption is violated, possibly because the observations were recorded sequentially over time. Autocorrelated residuals tend to make standard errors too small, so the p-values above should be read with some caution. The homoscedasticity check shows that residuals have a fairly constant spread across predicted values, supporting the assumption of equal residual variance. The normal Q-Q plot indicates that residuals are approximately normally distributed, which supports the use of statistical tests and confidence intervals. Finally, the residuals vs leverage plot shows a few points with high leverage or influence, but none appears influential enough to materially change the model.