options(warn = -1)
feature_selection
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(lmerTest)
Loading required package: lme4
Loading required package: Matrix
Attaching package: 'lmerTest'
The following object is masked from 'package:lme4':
lmer
The following object is masked from 'package:stats':
step
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ forcats 1.0.0 ✔ readr 2.1.5
✔ ggplot2 3.5.1 ✔ stringr 1.5.1
✔ lubridate 1.9.3 ✔ tibble 3.2.1
✔ purrr 1.0.2 ✔ tidyr 1.3.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ tidyr::expand() masks Matrix::expand()
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
✖ tidyr::pack() masks Matrix::pack()
✖ tidyr::unpack() masks Matrix::unpack()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(modelr)
library(purrr)
library(emmeans)
Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'
library(gridExtra)
Attaching package: 'gridExtra'
The following object is masked from 'package:dplyr':
combine
library(writexl)
library(gt)
library(webshot2)
library(broom.mixed)
library(ggplot2)
library(isotree)
load("Z:/Isaac/Visual Features/1-5/step2.RData")
Below are the repeatability tables.
The first four tables are from the model with the mean value, one table per time window (10, 20, 30 and 60 min).
print(df_10min_random_vars)
# A tibble: 17 × 4
feature sow Residual repeatability
<fct> <dbl> <dbl> <dbl>
1 Area 10653. 5575. 0.656
2 Centroid.X 23.3 21.1 0.525
3 Centroid.Y 16.7 11.2 0.597
4 Concavity 0.0310 0.0484 0.390
5 Convex.Area 12243. 8730. 0.584
6 Convex.Perimeter 90.1 53.8 0.626
7 Eccentricity 0.0266 0.0506 0.345
8 Elasticity 0.159 0.143 0.526
9 Elongation 0.144 0.239 0.376
10 Height 22.0 27.0 0.450
11 Major.Axis.Length 34.0 22.2 0.605
12 Minor.Axis.Length 22.1 27.0 0.450
13 Perimeter 173. 197. 0.467
14 Rightmost.X 27.4 14.7 0.650
15 Rightmost.Y 23.6 46.0 0.339
16 Roundness 0.0956 0.0871 0.523
17 Width 34.0 22.2 0.605
print(df_20min_random_vars)
# A tibble: 17 × 4
feature sow Residual repeatability
<fct> <dbl> <dbl> <dbl>
1 Area 10606. 5288. 0.667
2 Centroid.X 23.3 19.8 0.541
3 Centroid.Y 16.6 10.5 0.614
4 Concavity 0.0302 0.0448 0.403
5 Convex.Area 12142. 8179. 0.598
6 Convex.Perimeter 89.8 50.0 0.642
7 Eccentricity 0.0266 0.0479 0.357
8 Elasticity 0.159 0.134 0.542
9 Elongation 0.143 0.224 0.390
10 Height 21.6 25.3 0.461
11 Major.Axis.Length 34.1 20.8 0.621
12 Minor.Axis.Length 21.6 25.2 0.461
13 Perimeter 170. 184. 0.480
14 Rightmost.X 27.2 13.6 0.667
15 Rightmost.Y 23.4 42.0 0.358
16 Roundness 0.0952 0.0806 0.541
17 Width 34.1 20.8 0.621
print(df_30min_random_vars)
# A tibble: 17 × 4
feature sow Residual repeatability
<fct> <dbl> <dbl> <dbl>
1 Area 10549. 5096. 0.674
2 Centroid.X 23.2 19.0 0.550
3 Centroid.Y 16.4 9.87 0.624
4 Concavity 0.0302 0.0425 0.415
5 Convex.Area 12109. 7788. 0.609
6 Convex.Perimeter 89.0 47.6 0.651
7 Eccentricity 0.0266 0.0462 0.365
8 Elasticity 0.159 0.130 0.550
9 Elongation 0.144 0.215 0.402
10 Height 21.7 24.0 0.475
11 Major.Axis.Length 33.6 20.0 0.627
12 Minor.Axis.Length 21.8 24.0 0.475
13 Perimeter 169. 177. 0.489
14 Rightmost.X 26.7 12.9 0.675
15 Rightmost.Y 23.1 39.6 0.369
16 Roundness 0.0950 0.0770 0.552
17 Width 33.7 20.0 0.627
print(df_60min_random_vars)
# A tibble: 17 × 4
feature sow Residual repeatability
<fct> <dbl> <dbl> <dbl>
1 Area 10465. 4669. 0.691
2 Centroid.X 23.2 17.8 0.567
3 Centroid.Y 16.5 8.48 0.660
4 Concavity 0.0292 0.0384 0.432
5 Convex.Area 11981. 7002. 0.631
6 Convex.Perimeter 89.7 42.9 0.677
7 Eccentricity 0.0261 0.0425 0.381
8 Elasticity 0.159 0.119 0.573
9 Elongation 0.141 0.195 0.419
10 Height 20.8 21.7 0.490
11 Major.Axis.Length 34.4 18.4 0.651
12 Minor.Axis.Length 20.8 21.7 0.490
13 Perimeter 168. 161. 0.511
14 Rightmost.X 27.5 11.7 0.701
15 Rightmost.Y 23.0 34.1 0.402
16 Roundness 0.0954 0.0701 0.576
17 Width 34.5 18.4 0.651
Based on these values, we will continue to look at the following features for the mean model, which have the six lowest repeatabilities in every window: Rightmost.Y, Eccentricity, Elongation, Concavity, Height, Minor.Axis.Length.
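As a quick check on how to read the tables above, the repeatability column appears to equal sow / (sow + Residual), that is, the share of variance attributed to sow. The sketch below recomputes it for four rows copied from the 10 min mean table; the data frame `vc` and its contents are illustrative only and are not objects from step2.RData.

```r
# Variance components copied by hand from the 10 min mean table (four rows only).
vc <- data.frame(
  feature  = c("Area", "Centroid.X", "Concavity", "Eccentricity"),
  sow      = c(10653, 23.3, 0.0310, 0.0266),
  Residual = c(5575, 21.1, 0.0484, 0.0506)
)

# Assumed definition: between-sow variance over total variance.
vc$repeatability <- vc$sow / (vc$sow + vc$Residual)

# Sort ascending so the least repeatable features come first.
vc_sorted <- vc[order(vc$repeatability), ]
print(vc_sorted)
```

The recomputed values agree with the printed column to three decimals for these rows, which supports the assumed formula but does not confirm how the column was actually produced.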
The next four tables are from the model with the var value, again one table per time window.
print(df_10min_random_vars_var)
# A tibble: 17 × 4
feature sow Residual repeatability
<fct> <dbl> <dbl> <dbl>
1 Area 1.69e+7 2.39e+7 0.414
2 Centroid.X 3.14e+2 5.78e+2 0.352
3 Centroid.Y 3.10e+1 6.56e+1 0.321
4 Concavity 1.33e-3 1.49e-3 0.472
5 Convex.Area 1.54e+7 5.00e+7 0.236
6 Convex.Perimeter 2.39e+3 3.25e+3 0.424
7 Eccentricity 1.17e-3 1.68e-3 0.410
8 Elasticity 1.16e-2 1.91e-2 0.377
9 Elongation 1.24e-2 3.20e-2 0.279
10 Height 6.15e+1 3.89e+2 0.137
11 Major.Axis.Length 6.59e+2 7.48e+2 0.468
12 Minor.Axis.Length 6.15e+1 3.89e+2 0.137
13 Perimeter 1.91e+4 3.60e+4 0.346
14 Rightmost.X 4.72e+2 5.53e+2 0.460
15 Rightmost.Y 6.99e+2 1.74e+3 0.286
16 Roundness 2.59e-3 6.64e-3 0.281
17 Width 6.64e+2 7.50e+2 0.470
print(df_20min_random_vars_var)
# A tibble: 17 × 4
feature sow Residual repeatability
<fct> <dbl> <dbl> <dbl>
1 Area 1.82e+7 2.52e+7 0.419
2 Centroid.X 3.10e+2 5.62e+2 0.356
3 Centroid.Y 3.40e+1 7.69e+1 0.306
4 Concavity 1.34e-3 1.58e-3 0.459
5 Convex.Area 1.29e+7 5.26e+7 0.197
6 Convex.Perimeter 2.61e+3 3.26e+3 0.445
7 Eccentricity 1.20e-3 1.85e-3 0.393
8 Elasticity 1.39e-2 1.98e-2 0.412
9 Elongation 1.16e-2 3.43e-2 0.254
10 Height 7.92e+1 4.21e+2 0.158
11 Major.Axis.Length 7.27e+2 7.23e+2 0.501
12 Minor.Axis.Length 7.92e+1 4.21e+2 0.158
13 Perimeter 2.28e+4 3.81e+4 0.374
14 Rightmost.X 5.02e+2 5.35e+2 0.484
15 Rightmost.Y 8.45e+2 1.74e+3 0.327
16 Roundness 2.72e-3 6.75e-3 0.287
17 Width 7.31e+2 7.25e+2 0.502
print(df_30min_random_vars_var)
# A tibble: 17 × 4
feature sow Residual repeatability
<fct> <dbl> <dbl> <dbl>
1 Area 1.61e+7 2.61e+7 0.381
2 Centroid.X 3.69e+2 5.50e+2 0.401
3 Centroid.Y 3.04e+1 8.18e+1 0.271
4 Concavity 1.37e-3 1.60e-3 0.461
5 Convex.Area 1.21e+7 5.42e+7 0.182
6 Convex.Perimeter 2.59e+3 3.28e+3 0.442
7 Eccentricity 1.36e-3 1.92e-3 0.415
8 Elasticity 1.47e-2 1.90e-2 0.435
9 Elongation 1.37e-2 3.48e-2 0.283
10 Height 1.01e+2 4.31e+2 0.190
11 Major.Axis.Length 7.44e+2 7.08e+2 0.512
12 Minor.Axis.Length 1.01e+2 4.31e+2 0.189
13 Perimeter 2.41e+4 3.71e+4 0.394
14 Rightmost.X 5.69e+2 5.44e+2 0.511
15 Rightmost.Y 9.27e+2 1.68e+3 0.356
16 Roundness 2.75e-3 6.52e-3 0.296
17 Width 7.50e+2 7.10e+2 0.513
print(df_60min_random_vars_var)
# A tibble: 17 × 4
feature sow Residual repeatability
<fct> <dbl> <dbl> <dbl>
1 Area 1.60e+7 2.78e+7 0.366
2 Centroid.X 3.79e+2 5.10e+2 0.427
3 Centroid.Y 4.15e+1 8.91e+1 0.318
4 Concavity 1.15e-3 1.53e-3 0.428
5 Convex.Area 1.49e+7 5.59e+7 0.210
6 Convex.Perimeter 2.82e+3 3.26e+3 0.464
7 Eccentricity 1.56e-3 2.06e-3 0.431
8 Elasticity 1.67e-2 1.92e-2 0.464
9 Elongation 1.52e-2 3.46e-2 0.306
10 Height 1.42e+2 4.32e+2 0.247
11 Major.Axis.Length 8.18e+2 6.61e+2 0.553
12 Minor.Axis.Length 1.42e+2 4.32e+2 0.247
13 Perimeter 2.79e+4 3.82e+4 0.422
14 Rightmost.X 6.64e+2 5.13e+2 0.564
15 Rightmost.Y 1.07e+3 1.63e+3 0.396
16 Roundness 2.79e-3 6.06e-3 0.315
17 Width 8.25e+2 6.63e+2 0.554
Based on these values, we will continue to look at the following features for the var model, which are among the least repeatable across the windows: Convex.Area, Minor.Axis.Length, Height, Centroid.Y, Elongation, Roundness.
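One possible way to compare features across windows is to average each feature's repeatability over the four var tables. The sketch below does this for five of the selected features plus Rightmost.Y, which is included only as a comparison. The values are copied by hand from the tables above, and the averaging is an illustrative summary, not necessarily how the selection was made.

```r
# Repeatabilities copied from the four var-model tables above
# (columns: 10, 20, 30 and 60 min windows).
rep_var <- rbind(
  Convex.Area = c(0.236, 0.197, 0.182, 0.210),
  Height      = c(0.137, 0.158, 0.190, 0.247),
  Centroid.Y  = c(0.321, 0.306, 0.271, 0.318),
  Elongation  = c(0.279, 0.254, 0.283, 0.306),
  Roundness   = c(0.281, 0.287, 0.296, 0.315),
  Rightmost.Y = c(0.286, 0.327, 0.356, 0.396)  # not selected; shown for comparison
)
colnames(rep_var) <- c("10min", "20min", "30min", "60min")

# Average over windows and sort ascending (hypothetical summary).
mean_rep <- sort(rowMeans(rep_var))
print(round(mean_rep, 3))
```

On this summary Rightmost.Y has the highest average of the six, even though it is lower than Centroid.Y in the 10 min window.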
The following are the residual plots for each feature in each time window. Sows with too few rows in a window are dropped first.
# Keep only sows with enough rows in each window; the cutoff shrinks as the window grows.
aug_res_10_filt <- aug_res_10 %>%
  group_by(sow) %>%
  filter(n() > 2000) %>%
  ungroup()
aug_res_20_filt <- aug_res_20 %>%
  group_by(sow) %>%
  filter(n() > 1100) %>%
  ungroup()
aug_res_30_filt <- aug_res_30 %>%
  group_by(sow) %>%
  filter(n() > 700) %>%
  ungroup()
aug_res_60_filt <- aug_res_60 %>%
  group_by(sow) %>%
  filter(n() > 400) %>%
  ungroup()
Plots of the mean-value residuals, accounting for hour of the day:
options(repr.plot.width = 16, repr.plot.height = 250)
library(ggforce)
# Colored curves: one smooth per sow. Black curve: pooled smooth over all sows.
for (i in 1:5) {
  p1 <- ggplot(aug_res_10_filt, aes(x = ttf, y = .resid, color = sow)) +
    geom_smooth(se = FALSE) +
    geom_smooth(aes(x = ttf, y = .resid), linewidth = 1.2, color = "black") +
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = i) +
    ggtitle("10 min window mean")
  print(p1)
  p2 <- ggplot(aug_res_20_filt, aes(x = ttf, y = .resid, color = sow)) +
    geom_smooth(se = FALSE) +
    geom_smooth(aes(x = ttf, y = .resid), linewidth = 1.2, color = "black") +
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = i) +
    ggtitle("20 min window mean")
  print(p2)
  p3 <- ggplot(aug_res_30_filt, aes(x = ttf, y = .resid, color = sow)) +
    geom_smooth(se = FALSE) +
    geom_smooth(aes(x = ttf, y = .resid), linewidth = 1.2, color = "black") +
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = i) +
    ggtitle("30 min window mean")
  print(p3)
  p4 <- ggplot(aug_res_60_filt, aes(x = ttf, y = .resid, color = sow)) +
    geom_smooth(se = FALSE) +
    geom_smooth(aes(x = ttf, y = .resid), linewidth = 1.2, color = "black") +
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = i) +
    ggtitle("60 min window mean")
  print(p4)
}
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
After reviewing these plots for the model with the mean value, we will also continue to look at the following features:
Width: all windows. Consistent, then jumps and departs from the trend.
Rightmost.X: all windows. Consistent, then departs from the trend.
# Same row-count cutoffs as for the mean-value residuals.
aug_res_10_var_filt <- aug_res_10_var %>%
  group_by(sow) %>%
  filter(n() > 2000) %>%
  ungroup()
aug_res_20_var_filt <- aug_res_20_var %>%
  group_by(sow) %>%
  filter(n() > 1100) %>%
  ungroup()
aug_res_30_var_filt <- aug_res_30_var %>%
  group_by(sow) %>%
  filter(n() > 700) %>%
  ungroup()
aug_res_60_var_filt <- aug_res_60_var %>%
  group_by(sow) %>%
  filter(n() > 400) %>%
  ungroup()
Plots of the var-value residuals, accounting for hour of the day:
# Colored curves: one smooth per sow. Black curve: pooled smooth over all sows.
for (i in 1:5) {
  p1 <- ggplot(aug_res_10_var_filt, aes(x = ttf, y = .resid, color = sow)) +
    geom_smooth(se = FALSE) +
    geom_smooth(aes(x = ttf, y = .resid), linewidth = 1.2, color = "black") +
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = i) +
    ggtitle("10 min window var")
  print(p1)
  p2 <- ggplot(aug_res_20_var_filt, aes(x = ttf, y = .resid, color = sow)) +
    geom_smooth(se = FALSE) +
    geom_smooth(aes(x = ttf, y = .resid), linewidth = 1.2, color = "black") +
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = i) +
    ggtitle("20 min window var")
  print(p2)
  p3 <- ggplot(aug_res_30_var_filt, aes(x = ttf, y = .resid, color = sow)) +
    geom_smooth(se = FALSE) +
    geom_smooth(aes(x = ttf, y = .resid), linewidth = 1.2, color = "black") +
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = i) +
    ggtitle("30 min window var")
  print(p3)
  p4 <- ggplot(aug_res_60_var_filt, aes(x = ttf, y = .resid, color = sow)) +
    geom_smooth(se = FALSE) +
    geom_smooth(aes(x = ttf, y = .resid), linewidth = 1.2, color = "black") +
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = i) +
    ggtitle("60 min window var")
  print(p4)
}
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
After reviewing these plots for the model with the var value, we will also continue to look at the following features:
Centroid.X: all windows. Very consistent, then departs from the trendline.
Height: all windows. Very consistent, then departs from the trendline.
Major.Axis.Length: all windows but 60. Very consistent, then departs from the trendline.
Rightmost.X: all windows but 60. Very consistent, then departs from the trendline.
Width: all windows but 60. Very consistent, then departs from the trendline.
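The two plotting loops above repeat the same ggplot call for each window. As a sketch only, the call could be wrapped in a small helper; `plot_resid_page` and the toy data below are hypothetical and stand in for the aug_res_*_filt tables, which are assumed to have columns ttf, .resid, sow and feature.

```r
library(ggplot2)
library(ggforce)

# Hypothetical helper: one page of residual smooths for one window.
plot_resid_page <- function(data, title, page) {
  ggplot(data, aes(x = ttf, y = .resid)) +
    geom_smooth(aes(color = sow), se = FALSE) +      # one smooth per sow
    geom_smooth(linewidth = 1.2, color = "black") +  # pooled smooth over all sows
    facet_wrap_paginate(~feature, scales = "free", ncol = 2, nrow = 2, page = page) +
    ggtitle(title)
}

# Toy data standing in for the filtered residual tables.
set.seed(1)
toy <- expand.grid(ttf = 1:30, sow = c("a", "b"), feature = c("Width", "Height"))
toy$.resid <- rnorm(nrow(toy))

# One plot object per window; call print() on each element to render it.
windows <- list("10 min" = toy, "20 min" = toy)
plots <- lapply(names(windows), function(w) {
  plot_resid_page(windows[[w]], paste(w, "window mean"), page = 1)
})
```

With the real data, `windows` would hold the four filtered tables and the outer loop over pages 1 to 5 would stay as it is.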
All features of interest:
Mean
From the repeatability tables (all windows):
Rightmost.Y, Eccentricity, Elongation, Concavity, Height, Minor.Axis.Length
From the residual plots:
Width: all windows. Consistent, then jumps and departs from the trend.
Rightmost.X: all windows. Consistent, then departs from the trend.
Var
From the repeatability tables (all windows):
Convex.Area, Minor.Axis.Length, Height, Centroid.Y, Elongation, Roundness
From the residual plots:
Centroid.X: all windows. Very consistent, then departs from the trendline.
Height: all windows. Very consistent, then departs from the trendline.
Major.Axis.Length: all windows but 60. Very consistent, then departs from the trendline.
Rightmost.X: all windows but 60. Very consistent, then departs from the trendline.
Width: all windows but 60. Very consistent, then departs from the trendline.