Group 07A: Enola Heinzer, Leonie von Moos
Group 07B: Selina Kohler, Razoare Iulia
The data analyzed stems from an exploratory research study focusing on scoliosis and its diagnostic methods. Exploratory research means that no prior hypothesis is tested, so no definitive conclusions can be drawn without confirming them in a new study setting.
The provided dataset is a combination of two separately acquired datasets. The first one originated from the Larson Study, where data was collected from 72 participants who did not suffer from scoliosis. The second dataset was collected at the Balgrist University Hospital and included 120 participants with diagnosed or suspected scoliosis. The dataset analysed in the following pages is limited to 38 participants without and 39 participants with scoliosis. Reasons for this limitation were not given.
The diagnosis as well as the monitoring of scoliosis progression typically require frequent X-ray imaging, which exposes patients to potentially harmful levels of radiation over time. A new, less harmful measurement method is therefore needed. For this reason, two ideas were tested.
The first was to fit a model that correctly predicts whether or not a person suffers from scoliosis, using only parameters that do not require an X-ray image to be taken. In this model, the variable scoliosis was considered the response variable.
In a second step, a model was fitted to predict the primary Cobb angle of every participant, as this is the standard measure used to diagnose scoliosis (> 10°). For this prediction only the 38 participants of the Balgrist study could be considered, as no Cobb angle measurements were available for the other study. In this case, the primary Cobb angle was considered the response variable.
## 'data.frame': 77 obs. of 24 variables:
## $ age : int 15 17 20 23 16 15 14 16 23 15 ...
## $ gender : int 1 2 1 1 2 1 2 1 1 1 ...
## $ height : int 164 173 158 163 185 167 176 167 157 177 ...
## $ weight : int 46 68 56 51 60 61 59 62 43 63 ...
## $ primary_cobbAngle : int -33 18 15 38 -25 -12 -21 28 -33 54 ...
## $ cobbAngle_proximal_thoracic: int -15 -6 -12 NA -14 NA -7 -26 -6 -27 ...
## $ cobbAngle_main_thoracic : int 28 18 15 38 21 9 18 28 24 54 ...
## $ cobbAngle_thoracolumbar : int -33 -11 NA -38 -25 -12 -21 -16 -33 -31 ...
## $ rib_hump : num NA 5 0 10 5 0 5 10 5 5 ...
## $ lumbar_hump : num NA 20 5 13 22 10 7 0 10 5 ...
## $ RoM : num 63.1 54 71.3 69.9 57.8 ...
## $ inclination_diff : num -4.85 5.03 1.06 -7.23 -7.3 ...
## $ bending_diff : num 15.84 1.36 -11.15 -3.99 27.23 ...
## $ relative_bending_asymmetry : num 0.2509 0.0252 -0.1564 -0.057 0.4714 ...
## $ left_bending_angle : num 16.44 13.61 20.55 27.66 4.46 ...
## $ right_bending_angle : num 32.3 15 9.4 23.7 31.7 ...
## $ left_inclination_angle : num -34 -24.5 -35.1 -38.6 -32.5 ...
## $ right_inclination_angle : num 29.1 29.5 36.2 31.3 25.2 ...
## $ scoliosis : num 1 1 1 1 1 1 1 1 1 1 ...
## $ spine_diagnosis : int NA NA NA NA NA NA NA NA NA NA ...
## $ apex_proximal_thoracic.fac : Factor w/ 11 levels "","-","Th1","Th1/2",..: 7 4 3 2 7 1 11 8 5 8 ...
## $ apex_main_thoracic.fac : Factor w/ 14 levels "","Th10","Th10/11",..: 11 6 11 5 13 3 14 2 9 13 ...
## $ apex_thoracolumbar.fac : Factor w/ 10 levels "","L1","L1/2",..: 2 7 1 9 3 4 5 5 10 4 ...
## $ scoliosis.fac : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
scoliosis_data$severity <- NA
scoliosis_data <- scoliosis_data %>%
  mutate(severity = case_when(
    scoliosis == 0 ~ 'no',
    abs(primary_cobbAngle) >= 40 ~ 'severe',
    abs(primary_cobbAngle) >= 20 & abs(primary_cobbAngle) < 40 ~ 'moderate',
    abs(primary_cobbAngle) >= 10 & abs(primary_cobbAngle) < 20 ~ 'mild',
    TRUE ~ NA_character_  # default case: leave severity as NA if no condition matches
  ))
How is the variable scoliosis distributed in the dataset?
0: no scoliosis
1: scoliosis
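The count below was presumably produced with a simple frequency table; a minimal sketch, assuming the data frame is still called scoliosis_data:
# hedged sketch: frequency table of the binary scoliosis variable
table(scoliosis_data$scoliosis)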
##
## 0 1
## 38 39
In the beginning, the scoliosis cases were categorized according to their severity. This is how the categories are distributed over all participants.
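A minimal sketch of how this table could be produced, using the severity column created above (table() drops the missing values by default):
# hedged sketch: frequency table of the severity categories
table(scoliosis_data$severity)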
##
## mild moderate no severe
## 13 15 38 9
Bending Difference:
hist(scoliosis_data$bending_diff, main = "Histogram of Bending Difference",
     xlab = "Angle Difference [°]", col = "steelblue")
qqnorm(scoliosis_data$bending_diff, main = "Q-Q Plot of Bending Differences")
qqline(scoliosis_data$bending_diff, col = "steelblue")
Primary Cobb Angle:
hist(abs(scoliosis_data$primary_cobbAngle), main = "Histogram of Absolute Primary Cobb Angles",
     xlab = "Angle [°]", col = "steelblue")
qqnorm(abs(scoliosis_data$primary_cobbAngle), main = "Q-Q Plot of Absolute Primary Cobb Angles")
qqline(abs(scoliosis_data$primary_cobbAngle), col = "steelblue")
Rib Hump:
hist(scoliosis_data$rib_hump, main = "Histogram of Rib Humps",
     xlab = "Angle of Trunk Rotation [°]", col = "steelblue")
qqnorm(scoliosis_data$rib_hump, main = "Q-Q Plot of Rib Humps")
qqline(scoliosis_data$rib_hump, col = "steelblue")
Models were fitted for two different response variables. On the one hand, a binary model was used to predict the scoliosis status of each person (yes/no); on the other hand, a model was fitted for the primary Cobb angle of every participant, as this angle is used not only to diagnose scoliosis but also to monitor its progression.
As the scoliosis status is a binary variable (0 or 1), the first model was a binomial generalized linear model predicting scoliosis from the bending difference.
glm_skoli <- glm(scoliosis ~ abs(bending_diff),
                 data = scoliosis_data,
                 family = "binomial")
summary(glm_skoli)
##
## Call:
## glm(formula = scoliosis ~ abs(bending_diff), family = "binomial",
## data = scoliosis_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.67121 0.40002 -1.678 0.0934 .
## abs(bending_diff) 0.07360 0.03494 2.107 0.0351 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 106.73 on 76 degrees of freedom
## Residual deviance: 101.88 on 75 degrees of freedom
## AIC: 105.88
##
## Number of Fisher Scoring iterations: 4
ggplot(data = scoliosis_data,
       mapping = aes(y = scoliosis,
                     x = abs(bending_diff))) +
  geom_hline(yintercept = c(0, 1), colour = "lightgrey") +
  geom_point(aes(color = scoliosis)) +
  geom_smooth(method = "glm",
              method.args = list(family = "binomial"),
              se = FALSE)
## `geom_smooth()` using formula = 'y ~ x'
It can be observed that the fitted curve is nearly linear over the observed range, so this model would not be able to clearly distinguish between a healthy and an affected individual. This raised the question of whether a difference could be found for the extreme groups, i.e. participants with no scoliosis versus those with severe scoliosis.
glm_severity <- glm(scoliosis ~ abs(bending_diff),
                    data = scoliosis_data %>%
                      filter(severity %in% c("severe", "no")),
                    family = "binomial")
summary(glm_severity)
##
## Call:
## glm(formula = scoliosis ~ abs(bending_diff), family = "binomial",
## data = scoliosis_data %>% filter(severity %in% c("severe",
## "no")))
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.14161 0.89671 -3.503 0.000459 ***
## abs(bending_diff) 0.15444 0.06336 2.438 0.014787 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 45.907 on 46 degrees of freedom
## Residual deviance: 38.746 on 45 degrees of freedom
## AIC: 42.746
##
## Number of Fisher Scoring iterations: 5
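The model summary below resembles the output of jtools::summ() with exponentiated coefficients; assuming that package was used, it could be reproduced roughly like this (a sketch, not the confirmed original call):
# hedged sketch: odds-ratio summary of the severity model (assumes the jtools package)
library(jtools)
summ(glm_severity, exp = TRUE)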
## MODEL INFO:
## Observations: 47
## Dependent Variable: scoliosis
## Type: Generalized linear model
## Family: binomial
## Link function: logit
##
## MODEL FIT:
## χ²(1) = 7.16, p = 0.01
## Pseudo-R² (Cragg-Uhler) = 0.23
## Pseudo-R² (McFadden) = 0.16
## AIC = 42.75, BIC = 46.45
##
## Standard errors:MLE
## ------------------------------------------------------------------
## exp(Est.) 2.5% 97.5% z val. p
## ----------------------- ----------- ------ ------- -------- ------
## (Intercept) 0.04 0.01 0.25 -3.50 0.00
## abs(bending_diff) 1.17 1.03 1.32 2.44 0.01
## ------------------------------------------------------------------
ggplot(data = scoliosis_data %>%
         filter(severity %in% c("severe", "no")),
       mapping = aes(y = scoliosis,
                     x = abs(bending_diff))) +
  geom_hline(yintercept = c(0, 1), colour = "lightgrey") +
  geom_point(aes(color = severity)) +
  geom_smooth(method = "glm",
              method.args = list(family = "binomial"),
              se = FALSE)
## `geom_smooth()` using formula = 'y ~ x'
As shown in the first summary, 'bending_diff' seems to have a significant effect on the scoliosis status. To follow this up and obtain a more precise prediction, the additional category of moderate scoliosis was incorporated; however, 'bending_diff' then lost its significance as a predictor.
The model curve appears to be approximately sigmoid. However, the points of the two categories (scoliosis/no scoliosis) are not easily separable. Larger bending differences seem to predict scoliosis, but there are also some scoliosis patients with a small bending difference, and it is possible that these would not be identified by the model. A major limitation is the very small number of severe scoliosis patients in the dataset, which considerably reduces the power of the model. To further investigate this possible relationship, a new study testing specifically for this hypothesis would be necessary.
The second idea was to try to predict the patient's primary Cobb angle, which is clinically the most relevant. By definition, this is the most prominent angle of the scoliosis, independent of its location on the spine. It is determined from X-rays of the spine, which is why a simple model based on more easily measurable parameters (such as the size of the humps of the spine) would be advantageous for future scoliosis patients.
For this, a linear model based on 'bending_diff' was fitted again. This time the variable 'rib_hump' was added, as it seemed to be the more promising of the two humps when looking at correlation plots between the primary Cobb angle and the rib/lumbar hump (a sketch of such a check follows below). The range of motion was also added to the model, but it showed a considerably higher p-value and was therefore quickly discarded.
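A minimal sketch of the kind of correlation check described above; the plotting choices are assumptions, the variable names are taken from the dataset:
# hedged sketch: compare the primary Cobb angle with the two hump measurements
pairs(scoliosis_data[, c("primary_cobbAngle", "rib_hump", "lumbar_hump")])
cor(scoliosis_data[, c("primary_cobbAngle", "rib_hump", "lumbar_hump")],
    use = "pairwise.complete.obs")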
It is important to note that the model can only be trained and evaluated using the data from the Balgrist study (n = 38), because no Cobb angle was available for the participants of the Larson study.
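The summary below belongs to the linear model referred to later as lm2param; the fitting call is not shown, but based on the Call line it was presumably along these lines:
# hedged reconstruction of the model object used in the diagnostics below
# (the name lm2param is taken from the residual plots further down)
lm2param <- lm(primary_cobbAngle ~ bending_diff + rib_hump, data = scoliosis_data)
summary(lm2param)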
##
## Call:
## lm(formula = primary_cobbAngle ~ bending_diff + rib_hump, data = scoliosis_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -58.459 -14.275 -0.614 16.113 44.147
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.8376 5.8480 -0.485 0.63092
## bending_diff -0.8235 0.2996 -2.749 0.00989 **
## rib_hump 1.6565 0.6776 2.445 0.02038 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 23.96 on 31 degrees of freedom
## (43 observations deleted due to missingness)
## Multiple R-squared: 0.3866, Adjusted R-squared: 0.347
## F-statistic: 9.769 on 2 and 31 DF, p-value: 0.0005129
Both variables, 'bending_diff' and 'rib_hump', show low p-values and can be considered significant predictors of the Cobb angle.
Assumptions:
ggplot(mapping = aes(y = resid(lm2param), x = fitted(lm2param))) +
  geom_abline(intercept = 0, slope = 0) +
  geom_point() +
  geom_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Expected Value of Errors is Zero:
Looking at the first plot, the smoothed line does not lie exactly on the zero line, but it is not far off, so the model seems to perform reasonably well. The residuals mostly lie within a range of about 10°. Because a curved trend line was observed, it was assumed that the model cannot completely capture the relationship between the predictors and the outcome, indicating that the relationship may not be linear. In that case, the assumption that the errors have zero expectation is not fulfilled.
Because of this, quadratic terms were added for the predictors, but as soon as one was added, the corresponding variable lost its significance in the model.
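As an illustration of this step, one of the quadratic extensions might have looked like the following; the model name and the choice of which predictor to square are assumptions:
# hedged sketch: add a quadratic term for rib_hump to the two-predictor model
lm2param_quad <- lm(primary_cobbAngle ~ bending_diff + rib_hump + I(rib_hump^2),
                    data = scoliosis_data)
summary(lm2param_quad)  # per the text, adding such a term made the variable lose significance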
Normality of Residuals: The residuals seemed to be mostly normally distributed, as seen in the Q-Q plot. To ensure that the remaining deviation was within an acceptable range, 20 lines simulated from a normal distribution were plotted as well; this is shown in the last plot above. The residuals were therefore considered roughly normally distributed.
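A minimal sketch of how such a comparison with simulated normal samples could be drawn; the exact plotting code is an assumption, lm2param is the model from above:
# hedged sketch: Q-Q plot of the residuals with 20 lines from simulated normal samples
res <- resid(lm2param)
qqnorm(res, main = "Q-Q Plot of Residuals")
qqline(res, col = "steelblue")
for (i in 1:20) {
  sim <- rnorm(length(res), mean = 0, sd = sd(res))
  qqline(sim, col = adjustcolor("grey", alpha.f = 0.4))  # reference line from one simulated sample
}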
Homoscedasticity (Constant Variability of Residuals): To check if the variability is constant, the square-root transformed absolute values of the residuals were plotted against the fitted values.
ggplot(mapping = aes(y = sqrt(abs(resid(lm2param))), x = fitted(lm2param))) +
  geom_abline(intercept = 0, slope = 0) +
  geom_point() +
  geom_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Looking at the grey curve, which represents the spread of the residuals, the variance seems to be roughly constant.
scoliosis_non_zero <- scoliosis_data %>%
  filter(!is.na(rib_hump))
scoliosis_non_zero$resid.lm2param <- resid(lm2param)
ggplot(data = scoliosis_non_zero,
       mapping = aes(y = resid.lm2param, x = severity)) +
  geom_boxplot()
Influential observations:
Looking at the outliers, three are found and highlighted in the plots (observations 11, 25, 30). However, all of them lie below the threshold of 0.5 for Cook's distance. Therefore, none of them is influential enough to distort the model, and they do not have to be handled separately.
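A minimal sketch of the influence check described above, assuming the model object lm2param and the base R diagnostic plots:
# hedged sketch: Cook's distances of the fitted linear model
cooks_d <- cooks.distance(lm2param)
plot(lm2param, which = 4)   # Cook's distance per observation
which(cooks_d > 0.5)        # observations above the 0.5 threshold (none expected)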
Because the relationship between the predictors and the outcome did not seem to be linear, regression splines were used to try to improve the model.
library(splines)  # provides ns()
lm_spline <- lm(primary_cobbAngle ~ ns(bending_diff, df = 4) +
                  ns(rib_hump, df = 3),
                data = scoliosis_data)
summary(lm_spline)
##
## Call:
## lm(formula = primary_cobbAngle ~ ns(bending_diff, df = 4) + ns(rib_hump,
## df = 3), data = scoliosis_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -61.03 -11.82 1.61 10.28 47.76
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.388 30.183 0.212 0.8340
## ns(bending_diff, df = 4)1 -20.890 27.572 -0.758 0.4555
## ns(bending_diff, df = 4)2 -23.544 24.730 -0.952 0.3498
## ns(bending_diff, df = 4)3 -19.748 65.156 -0.303 0.7642
## ns(bending_diff, df = 4)4 -48.707 21.025 -2.317 0.0287 *
## ns(rib_hump, df = 3)1 51.347 23.927 2.146 0.0414 *
## ns(rib_hump, df = 3)2 33.732 23.922 1.410 0.1704
## ns(rib_hump, df = 3)3 29.547 25.088 1.178 0.2496
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24.32 on 26 degrees of freedom
## (43 observations deleted due to missingness)
## Multiple R-squared: 0.4698, Adjusted R-squared: 0.327
## F-statistic: 3.291 on 7 and 26 DF, p-value: 0.01216
Two of the spline terms appear to be significant; the residuals of this model are analysed further below. As an additional check, the same two predictors were also used in a linear model for the proximal thoracic Cobb angle.
lm2paramProx <- lm(cobbAngle_proximal_thoracic ~ bending_diff + rib_hump,
                   data = scoliosis_data)
summary(lm2paramProx)
##
## Call:
## lm(formula = cobbAngle_proximal_thoracic ~ bending_diff + rib_hump,
## data = scoliosis_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.2976 -5.0418 0.2205 5.7497 10.2303
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -10.83585 1.94898 -5.560 1.18e-05 ***
## bending_diff 0.20636 0.09292 2.221 0.03648 *
## rib_hump -0.73456 0.21564 -3.406 0.00242 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.573 on 23 degrees of freedom
## (51 observations deleted due to missingness)
## Multiple R-squared: 0.4885, Adjusted R-squared: 0.444
## F-statistic: 10.98 on 2 and 23 DF, p-value: 0.0004489
To look for remaining non-linear patterns, the residuals of the regression spline model fitted above were also examined graphically.
ggplot(mapping = aes(y = resid(lm_spline), x = fitted(lm_spline))) +
  geom_abline(intercept = 0, slope = 0) +
  geom_point() +
  geom_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
As the relationship between the outcome and the predictors might not be linear, a generalized additive model (GAM) was also fitted with the best predictors so far (bending_diff and rib_hump) and primary_cobbAngle as the outcome, to improve the model's performance.
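The GAM summarised below was presumably fitted with the mgcv package; a minimal sketch under that assumption (the object name gam_cobb is made up):
# hedged sketch: GAM with smooth terms for both predictors
library(mgcv)
gam_cobb <- gam(primary_cobbAngle ~ s(bending_diff) + s(rib_hump),
                data = scoliosis_data)
summary(gam_cobb)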
##
## Family: gaussian
## Link function: identity
##
## Formula:
## primary_cobbAngle ~ s(bending_diff) + s(rib_hump)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.471 4.039 2.345 0.0258 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(bending_diff) 1.000 1.000 8.628 0.00631 **
## s(rib_hump) 1.604 1.966 3.207 0.04249 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Rank: 17/19
## R-sq.(adj) = 0.369 Deviance explained = 41.9%
## GCV = 620.42 Scale est. = 554.65 n = 34
The plots of the smooth terms show two different shapes: the effect of the bending difference is essentially linear (edf = 1), while the effect of the rib hump is not. The bending difference was therefore entered as a linear term in the following GAM, and the rib hump was kept as a smooth term (edf ≈ 1.6).
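A minimal sketch of the refined GAM described above; the object name gam_cobb2 is an assumption, and the basis-dimension check at the end of the output resembles what mgcv::gam.check() prints:
# hedged sketch: bending_diff as a linear term, rib_hump kept as a smooth term
gam_cobb2 <- gam(primary_cobbAngle ~ bending_diff + s(rib_hump),
                 data = scoliosis_data)
summary(gam_cobb2)
gam.check(gam_cobb2)  # smoothing-convergence and basis-dimension diagnostics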
##
## Family: gaussian
## Link function: identity
##
## Formula:
## primary_cobbAngle ~ bending_diff + s(rib_hump)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.5801 4.0899 1.853 0.07356 .
## bending_diff -0.8845 0.3011 -2.937 0.00626 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(rib_hump) 1.605 1.967 3.211 0.0425 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.369 Deviance explained = 41.9%
## GCV = 620.43 Scale est. = 554.65 n = 34
##
## Method: GCV Optimizer: magic
## Smoothing parameter selection converged after 4 iterations.
## The RMS GCV score gradient at convergence was 6.775169e-06 .
## The Hessian was positive definite.
## Model rank = 11 / 11
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(rib_hump) 9.0 1.6 1.18 0.78