Regression Analysis Summary: NOAAGISSWD Dataset

Overview

Using the NOAAGISSWD dataset, We examined regression models relating disaster outcomes to delta.temp and Year, including all candidate models up to the full second-order form:

1. Code

# 1. Load professor's provided packages and data
source("regday4.pck")
source("regboot1.pck")

wd <- NOAAGISSWD

# 2. Define Candidate Models (up to full 2nd order)
# dt = delta.temp, Y = Year
forms <- list(
  M1  = ~ delta.temp,
  M2  = ~ Year,
  M3  = ~ delta.temp + Year,
  M4  = ~ delta.temp * Year,
  M5  = ~ delta.temp + Year + I(delta.temp^2),
  M6  = ~ delta.temp + Year + I(Year^2),
  M7  = ~ delta.temp + Year + I(delta.temp^2) + I(Year^2),
  M8  = ~ delta.temp * Year + I(delta.temp^2),
  M9  = ~ delta.temp * Year + I(Year^2),
  M10 = ~ delta.temp * Year + I(delta.temp^2) + I(Year^2)
)

# 3. Helper Functions

# PRESS for Linear Models
PRESS_manual <- function(model) {
  e <- residuals(model)
  h <- lm.influence(model)$hat
  sum((e / (1 - h))^2)
}

# Plausibility Check: 95% CI (Est +/- 1.96*SE) excludes 0
# for all non-intercept coefficients
is_plausible <- function(model) {
  sm <- summary(model)$coefficients
  if (is.null(dim(sm))) return(FALSE)
  sm <- sm[rownames(sm) != "(Intercept)", , drop = FALSE]
  if (nrow(sm) == 0) return(FALSE)
  lower <- sm[, 1] - 1.96 * sm[, 2]
  upper <- sm[, 1] + 1.96 * sm[, 2]
  all(lower * upper > 0)
}

# 4. Main Analysis and Selection Logic
# is_logit = TRUE  --> glm binomial + AIC  + logitboot
# is_logit = FALSE --> lm            + PRESS + lmboot
run_disaster_analysis <- function(resp, is_logit = FALSE) {
  cat("\n--- Analyzing:", resp, "---\n")
  cat("Method:", ifelse(is_logit, 
                        "Logistic Regression (logitboot)", 
                        "Linear Regression (lmboot)"), "\n")
  cat("Selection:", ifelse(is_logit, "AIC", "PRESS"), "\n")
  
  # Fit all candidate models
  models <- lapply(forms, function(f) {
    if (is_logit) glm(update(f, as.formula(paste(resp, "~ ."))),
                      data = wd, family = binomial)
    else          lm(update(f, as.formula(paste(resp, "~ ."))),
                     data = wd)
  })
  
  # Print all models with plausibility and metric
  cat("\nAll Candidate Models:\n")
  all_results <- data.frame()
  for (nm in names(models)) {
    mod <- models[[nm]]
    p   <- is_plausible(mod)
    metric_val <- ifelse(is_logit,
                         round(AIC(mod), 3),
                         round(PRESS_manual(mod), 3))
    all_results <- rbind(all_results, data.frame(
      Model     = nm,
      Plausible = p,
      Metric    = metric_val
    ))
  }
  names(all_results)[3] <- ifelse(is_logit, "AIC", "PRESS")
  print(all_results, row.names = FALSE)
  
  # Find plausible models
  plausible_idx <- which(sapply(models, is_plausible))
  
  cat("\nPlausible Models:\n")
  if (length(plausible_idx) == 0) {
    cat("No plausible models found.\n")
    return(NULL)
  }
  print(all_results[all_results$Plausible == TRUE, ], row.names = FALSE)
  
  # Select best by AIC (logistic) or PRESS (linear)
  if (is_logit) {
    metrics   <- sapply(models, AIC)
    best_name <- names(forms)[plausible_idx[which.min(metrics[plausible_idx])]]
  } else {
    metrics   <- sapply(models, PRESS_manual)
    best_name <- names(forms)[plausible_idx[which.min(metrics[plausible_idx])]]
  }
  
  best_form <- as.formula(paste(resp,
                                paste(forms[[best_name]], collapse = "")))
  
  cat("\nBest Plausible Model:", best_name, "\n")
  print(best_form)
  cat("\nBest Model Summary:\n")
  print(summary(models[[best_name]]))
  
  return(best_form)
}

# 5. Identify Best Formulas
# Wildfire and Drought --> Logistic (is_logit = TRUE)
# All Disasters and Severe Storm --> Linear (is_logit = FALSE)
best_wildfire_form <- run_disaster_analysis("Wildfire.Count",
                                            is_logit = TRUE)

## 
## --- Analyzing: Wildfire.Count ---
## Method: Logistic Regression (logitboot) 
## Selection: AIC 
## 
## All Candidate Models:
##  Model Plausible    AIC
##     M1      TRUE 49.406
##     M2      TRUE 47.787
##     M3     FALSE 49.748
##     M4     FALSE 51.717
##     M5     FALSE 51.518
##     M6     FALSE 51.743
##     M7     FALSE 53.121
##     M8     FALSE 52.739
##     M9     FALSE 53.422
##    M10     FALSE 53.831
## 
## Plausible Models:
##  Model Plausible    AIC
##     M1      TRUE 49.406
##     M2      TRUE 47.787
## 
## Best Plausible Model: M2 
## Wildfire.Count ~ Year
## <environment: 0x6388873487e8>
## 
## Best Model Summary:
## 
## Call:
## glm(formula = update(f, as.formula(paste(resp, "~ ."))), family = binomial, 
##     data = wd)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -243.44788   72.16830  -3.373 0.000743 ***
## Year           0.12163    0.03606   3.373 0.000743 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 60.997  on 43  degrees of freedom
## Residual deviance: 43.787  on 42  degrees of freedom
## AIC: 47.787
## 
## Number of Fisher Scoring iterations: 4

best_drought_form  <- run_disaster_analysis("Drought.Count",
                                            is_logit = TRUE)

## 
## --- Analyzing: Drought.Count ---
## Method: Logistic Regression (logitboot) 
## Selection: AIC 
## 
## All Candidate Models:
##  Model Plausible    AIC
##     M1      TRUE 51.335
##     M2      TRUE 51.153
##     M3     FALSE 52.995
##     M4     FALSE 54.529
##     M5     FALSE 54.288
##     M6     FALSE 54.972
##     M7     FALSE 55.882
##     M8     FALSE 56.168
##     M9     FALSE 55.485
##    M10     FALSE 56.580
## 
## Plausible Models:
##  Model Plausible    AIC
##     M1      TRUE 51.335
##     M2      TRUE 51.153
## 
## Best Plausible Model: M2 
## Drought.Count ~ Year
## <environment: 0x638889e8b5f8>
## 
## Best Model Summary:
## 
## Call:
## glm(formula = update(f, as.formula(paste(resp, "~ ."))), family = binomial, 
##     data = wd)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -141.19884   61.85816  -2.283   0.0225 *
## Year           0.07106    0.03097   2.294   0.0218 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 53.413  on 43  degrees of freedom
## Residual deviance: 47.153  on 42  degrees of freedom
## AIC: 51.153
## 
## Number of Fisher Scoring iterations: 4

best_total_form    <- run_disaster_analysis("All.Disasters.Count",
                                            is_logit = FALSE)

## 
## --- Analyzing: All.Disasters.Count ---
## Method: Linear Regression (lmboot) 
## Selection: PRESS 
## 
## All Candidate Models:
##  Model Plausible   PRESS
##     M1      TRUE 651.216
##     M2      TRUE 597.168
##     M3     FALSE 615.889
##     M4     FALSE 374.700
##     M5      TRUE 383.433
##     M6     FALSE 412.641
##     M7      TRUE 370.724
##     M8     FALSE 391.964
##     M9     FALSE 394.161
##    M10      TRUE 333.728
## 
## Plausible Models:
##  Model Plausible   PRESS
##     M1      TRUE 651.216
##     M2      TRUE 597.168
##     M5      TRUE 383.433
##     M7      TRUE 370.724
##    M10      TRUE 333.728
## 
## Best Plausible Model: M10 
## All.Disasters.Count ~ delta.temp * Year + I(delta.temp^2) + I(Year^2)
## <environment: 0x63888a99cdf8>
## 
## Best Model Summary:
## 
## Call:
## lm(formula = update(f, as.formula(paste(resp, "~ ."))), data = wd)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.7011 -1.9905 -0.5961  1.2394  5.8747 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)   
## (Intercept)      2.188e+05  6.920e+04   3.162  0.00307 **
## delta.temp       9.667e+03  3.503e+03   2.760  0.00885 **
## Year            -2.218e+02  7.011e+01  -3.163  0.00306 **
## I(delta.temp^2)  1.386e+02  4.397e+01   3.151  0.00317 **
## I(Year^2)        5.620e-02  1.776e-02   3.164  0.00306 **
## delta.temp:Year -4.908e+00  1.774e+00  -2.766  0.00871 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.479 on 38 degrees of freedom
## Multiple R-squared:  0.8597, Adjusted R-squared:  0.8413 
## F-statistic: 46.58 on 5 and 38 DF,  p-value: 3.433e-15

best_storm_form    <- run_disaster_analysis("Severe.Storm.Count",
                                            is_logit = FALSE)

## 
## --- Analyzing: Severe.Storm.Count ---
## Method: Linear Regression (lmboot) 
## Selection: PRESS 
## 
## All Candidate Models:
##  Model Plausible   PRESS
##     M1      TRUE 308.454
##     M2      TRUE 278.452
##     M3     FALSE 289.904
##     M4     FALSE 149.430
##     M5      TRUE 171.933
##     M6     FALSE 160.778
##     M7      TRUE 154.807
##     M8     FALSE 168.226
##     M9     FALSE 160.851
##    M10      TRUE 143.564
## 
## Plausible Models:
##  Model Plausible   PRESS
##     M1      TRUE 308.454
##     M2      TRUE 278.452
##     M5      TRUE 171.933
##     M7      TRUE 154.807
##    M10      TRUE 143.564
## 
## Best Plausible Model: M10 
## Severe.Storm.Count ~ delta.temp * Year + I(delta.temp^2) + I(Year^2)
## <environment: 0x6388886c8a60>
## 
## Best Model Summary:
## 
## Call:
## lm(formula = update(f, as.formula(paste(resp, "~ ."))), data = wd)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8189 -1.0013  0.0217  0.9033  3.4247 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)   
## (Intercept)      1.302e+05  4.353e+04   2.991  0.00486 **
## delta.temp       5.147e+03  2.203e+03   2.336  0.02488 * 
## Year            -1.319e+02  4.411e+01  -2.989  0.00488 **
## I(delta.temp^2)  7.484e+01  2.766e+01   2.706  0.01015 * 
## I(Year^2)        3.338e-02  1.117e-02   2.988  0.00490 **
## delta.temp:Year -2.614e+00  1.116e+00  -2.342  0.02453 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.56 on 38 degrees of freedom
## Multiple R-squared:  0.8867, Adjusted R-squared:  0.8718 
## F-statistic: 59.48 on 5 and 38 DF,  p-value: < 2.2e-16

############################################################
# 6. Final Bootstrap Validation and Plotting
############################################################

cat("\nGenerating Final Bootstrap Results and Plots...\n")

## 
## Generating Final Bootstrap Results and Plots...

# Wildfire (Logistic Bootstrap)
glm.out        <- glm(best_wildfire_form, data = wd, family = binomial)
wildfire_final <- logitboot(best_wildfire_form, wd, nboot = 1000)

## [1] 500
## [1] 1000

mtext("Wildfire.Count Model", side = 3, line = 3, cex = 1.1)

# Drought (Logistic Bootstrap)
glm.out       <- glm(best_drought_form, data = wd, family = binomial)
drought_final <- logitboot(best_drought_form, wd, nboot = 1000)

## [1] 500
## [1] 1000

mtext("Drought.Count Model", side = 3, line = 3, cex = 1.1)

# All Disasters (Linear Bootstrap)
total_final <- lmboot(best_total_form, wd,
                      nboot = 1000, interval.type = "pred")

## Warning in predict.lm(lm.out0, interval = interval.type, level = 1 - alpha): predictions on current data refer to _future_ responses

## [1] 44  2
## [1] 500
## [1] 1000

mtext("All.Disasters.Count Model", side = 3, line = 3, cex = 1.1)

# Severe Storm (Linear Bootstrap)
storm_final <- lmboot(best_storm_form, wd,
                      nboot = 1000, interval.type = "pred")

## Warning in predict.lm(lm.out0, interval = interval.type, level = 1 - alpha): predictions on current data refer to _future_ responses

## [1] 44  2
## [1] 500
## [1] 1000

mtext("Severe.Storm.Count Model", side = 3, line = 3, cex = 1.1)

############################################################
# 7. Print Final Bootstrap Coefficients
############################################################

cat("\nWildfire.Count Model Coefficients:\n")

## 
## Wildfire.Count Model Coefficients:

print(wildfire_final$coef)

##  (Intercept)         Year 
## -243.4478775    0.1216327

cat("\nDrought.Count Model Coefficients:\n")

## 
## Drought.Count Model Coefficients:

print(drought_final$coef)

##   (Intercept)          Year 
## -141.19883593    0.07106025

cat("\nAll.Disasters.Count Model Coefficients:\n")

## 
## All.Disasters.Count Model Coefficients:

print(total_final$coef)

##     (Intercept)      delta.temp            Year I(delta.temp^2)       I(Year^2) 
##    2.188306e+05    9.667064e+03   -2.217899e+02    1.385640e+02    5.619854e-02 
## delta.temp:Year 
##   -4.908488e+00

cat("\nSevere.Storm.Count Model Coefficients:\n")

## 
## Severe.Storm.Count Model Coefficients:

print(storm_final$coef)

##     (Intercept)      delta.temp            Year I(delta.temp^2)       I(Year^2) 
##    1.302037e+05    5.146517e+03   -1.318537e+02    7.484359e+01    3.338186e-02 
## delta.temp:Year 
##   -2.614145e+00

Regression Analysis Summary: NOAAGISSWD Dataset

Overview

Using the NOAAGISSWD dataset, We examined regression models relating disaster outcomes to delta.temp and Year, including all candidate models up to the full second-order form:

1. Code

Show in New Window

--- Analyzing: Wildfire.Count ---
Method: Logistic Regression (logitboot) 
Selection: AIC 

All Candidate Models:

Plausible Models:

Best Plausible Model: M2 
Wildfire.Count ~ Year
<environment: 0x620f3671e338>

Best Model Summary:

Call:
glm(formula = update(f, as.formula(paste(resp, "~ ."))), family = binomial, 
    data = wd)

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept) -243.44788   72.16830  -3.373 0.000743 ***
Year           0.12163    0.03606   3.373 0.000743 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 60.997  on 43  degrees of freedom
Residual deviance: 43.787  on 42  degrees of freedom
AIC: 47.787

Number of Fisher Scoring iterations: 4


--- Analyzing: Drought.Count ---
Method: Logistic Regression (logitboot) 
Selection: AIC 

All Candidate Models:

Plausible Models:

Best Plausible Model: M2 
Drought.Count ~ Year
<environment: 0x620f333f68a8>

Best Model Summary:

Call:
glm(formula = update(f, as.formula(paste(resp, "~ ."))), family = binomial, 
    data = wd)

Coefficients:
              Estimate Std. Error z value Pr(>|z|)  
(Intercept) -141.19884   61.85816  -2.283   0.0225 *
Year           0.07106    0.03097   2.294   0.0218 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 53.413  on 43  degrees of freedom
Residual deviance: 47.153  on 42  degrees of freedom
AIC: 51.153

Number of Fisher Scoring iterations: 4


--- Analyzing: All.Disasters.Count ---
Method: Linear Regression (lmboot) 
Selection: PRESS 

All Candidate Models:

Plausible Models:

Best Plausible Model: M10 
All.Disasters.Count ~ delta.temp * Year + I(delta.temp^2) + I(Year^2)
<environment: 0x620f33f25f48>

Best Model Summary:

Call:
lm(formula = update(f, as.formula(paste(resp, "~ ."))), data = wd)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.7011 -1.9905 -0.5961  1.2394  5.8747 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)   
(Intercept)      2.188e+05  6.920e+04   3.162  0.00307 **
delta.temp       9.667e+03  3.503e+03   2.760  0.00885 **
Year            -2.218e+02  7.011e+01  -3.163  0.00306 **
I(delta.temp^2)  1.386e+02  4.397e+01   3.151  0.00317 **
I(Year^2)        5.620e-02  1.776e-02   3.164  0.00306 **
delta.temp:Year -4.908e+00  1.774e+00  -2.766  0.00871 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.479 on 38 degrees of freedom
Multiple R-squared:  0.8597,    Adjusted R-squared:  0.8413 
F-statistic: 46.58 on 5 and 38 DF,  p-value: 3.433e-15


--- Analyzing: Severe.Storm.Count ---
Method: Linear Regression (lmboot) 
Selection: PRESS 

All Candidate Models:

Plausible Models:

Best Plausible Model: M10 
Severe.Storm.Count ~ delta.temp * Year + I(delta.temp^2) + I(Year^2)
<environment: 0x620f34e24b68>

Best Model Summary:

Call:
lm(formula = update(f, as.formula(paste(resp, "~ ."))), data = wd)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.8189 -1.0013  0.0217  0.9033  3.4247 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)   
(Intercept)      1.302e+05  4.353e+04   2.991  0.00486 **
delta.temp       5.147e+03  2.203e+03   2.336  0.02488 * 
Year            -1.319e+02  4.411e+01  -2.989  0.00488 **
I(delta.temp^2)  7.484e+01  2.766e+01   2.706  0.01015 * 
I(Year^2)        3.338e-02  1.117e-02   2.988  0.00490 **
delta.temp:Year -2.614e+00  1.116e+00  -2.342  0.02453 * 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.56 on 38 degrees of freedom
Multiple R-squared:  0.8867,    Adjusted R-squared:  0.8718 
F-statistic: 59.48 on 5 and 38 DF,  p-value: < 2.2e-16


Generating Final Bootstrap Results and Plots...
[1] 500
[1] 1000
[1] 500
[1] 1000
G2;H2;Warningh in predict.lm(lm.out0, interval = interval.type, level = 1 - alpha) :
  predictions on current data refer to _future_ responses
g
[1] 44  2
[1] 500
[1] 1000
G2;H2;Warningh in predict.lm(lm.out0, interval = interval.type, level = 1 - alpha) :
  predictions on current data refer to _future_ responses
g
[1] 44  2
[1] 500
[1] 1000

Wildfire.Count Model Coefficients:
 (Intercept)         Year 
-243.4478775    0.1216327 

Drought.Count Model Coefficients:
  (Intercept)          Year 
-141.19883593    0.07106025 

All.Disasters.Count Model Coefficients:
    (Intercept)      delta.temp            Year I(delta.temp^2) 
   2.188306e+05    9.667064e+03   -2.217899e+02    1.385640e+02 
      I(Year^2) delta.temp:Year 
   5.619854e-02   -4.908488e+00 

Severe.Storm.Count Model Coefficients:
    (Intercept)      delta.temp            Year I(delta.temp^2) 
   1.302037e+05    5.146517e+03   -1.318537e+02    7.484359e+01 
      I(Year^2) delta.temp:Year 
   3.338186e-02   -2.614145e+00

R Console

Description:df [10 × 3]

Model <chr>	Plausible <lgl>	AIC <dbl>
M1	TRUE	49.406
M2	TRUE	47.787
M3	FALSE	49.748
M4	FALSE	51.717
M5	FALSE	51.518
M6	FALSE	51.743
M7	FALSE	53.121
M8	FALSE	52.739
M9	FALSE	53.422
M10	FALSE	53.831

1-10 of 10 rows

data.frame

10 x 3

Description:df [2 × 3]

	Model <chr>	Plausible <lgl>	AIC <dbl>
1	M1	TRUE	49.406
2	M2	TRUE	47.787

TRUE

49.406

TRUE

47.787

2 rows

data.frame

2 x 3

Description:df [10 × 3]

Model <chr>	Plausible <lgl>	AIC <dbl>
M1	TRUE	51.335
M2	TRUE	51.153
M3	FALSE	52.995
M4	FALSE	54.529
M5	FALSE	54.288
M6	FALSE	54.972
M7	FALSE	55.882
M8	FALSE	56.168
M9	FALSE	55.485
M10	FALSE	56.580

1-10 of 10 rows

data.frame

10 x 3

Description:df [2 × 3]

	Model <chr>	Plausible <lgl>	AIC <dbl>
1	M1	TRUE	51.335
2	M2	TRUE	51.153

TRUE

51.335

TRUE

51.153

2 rows

data.frame

2 x 3

Description:df [10 × 3]

Model <chr>	Plausible <lgl>	PRESS <dbl>
M1	TRUE	651.216
M2	TRUE	597.168
M3	FALSE	615.889
M4	FALSE	374.700
M5	TRUE	383.433
M6	FALSE	412.641
M7	TRUE	370.724
M8	FALSE	391.964
M9	FALSE	394.161
M10	TRUE	333.728

1-10 of 10 rows

data.frame

10 x 3

Description:df [5 × 3]

	Model <chr>	Plausible <lgl>	PRESS <dbl>
1	M1	TRUE	651.216
2	M2	TRUE	597.168
5	M5	TRUE	383.433
7	M7	TRUE	370.724
10	M10	TRUE	333.728

5 rows

data.frame

5 x 3

Description:df [10 × 3]

Model <chr>	Plausible <lgl>	PRESS <dbl>
M1	TRUE	308.454
M2	TRUE	278.452
M3	FALSE	289.904
M4	FALSE	149.430
M5	TRUE	171.933
M6	FALSE	160.778
M7	TRUE	154.807
M8	FALSE	168.226
M9	FALSE	160.851
M10	TRUE	143.564

1-10 of 10 rows

data.frame

10 x 3

Description:df [5 × 3]

	Model <chr>	Plausible <lgl>	PRESS <dbl>
1	M1	TRUE	308.454
2	M2	TRUE	278.452
5	M5	TRUE	171.933
7	M7	TRUE	154.807
10	M10	TRUE	143.564

5 rows

data.frame

5 x 3

Modify Chunk Options Run All Chunks Above Run Current Chunk

2. Explantions

1. Wildfire.Count (Logistic Regression)

Model: Wildfire occurrence ~ Year
Selection: Best AIC (47.79) among plausible models M1 and M2.
Key results:
- Significant positive Year coefficient (0.122, p < 0.001)
- Strong evidence wildfire probability increases over time
Graph summary:
- Logistic curve fits increasing wildfire occurrence probability well.
- The S-shape of the curve reflects how the probability starts low, increases steadily, and then levels off as it approaches 1
- Bootstrap confidence bands (red lines) capture model uncertainty but confirm upward trend making it statistically reliable instead of due to random variation.
- Observed 0/1 wildfire points align closely with predicted probabilities since its scattered arounf the curve and following the upward curve.

2. Drought.Count (Logistic Regression)

Model: Drought occurrence ~ Year
Selection: Best AIC (51.15) among plausible models M1 and M2.
Key results:
- Positive Year effect (0.071, p ≈ 0.022), indicating rising drought likelihood over years.
Graph summary:
- Logistic fit captures upward trend with wider confidence intervals than wildfire. This shows that drought probability is increasing but at a slower rate.
- The observed points show variability consistently with predictions. This tells us why the confidence range are wider indicating more uncertainty in predicting drought occurrences compared to wildfires.
- However, the overall upward pattern is still clear, and the bootstrap intervals confirm that this increase is statistically meaningful.

3. All.Disasters.Count (Linear Regression, Full 2nd Order)

Model: dt + Y + I(dt²) + I(Y²),
Selection: Lowest PRESS (333.73) among plausible models.
Key results:
- All terms highly significant (p < 0.01).
- Adjusted R² ≈ 0.84 indicating strong model fit.
- Complex nonlinear and interaction effects highlighted.
Graph summary:
- Observed disaster counts (points) well captured by fitted regression line (blue).
- Disaster counts are not just increasing steadily, they are accelerating over time and with temperature changes. The interaction term (delta.temp × Year) adds complexity, showing that the effect of temperature depends on the year. In later years, temperature changes have a stronger impact on disaster counts.
- Bootstrap prediction intervals widen at higher counts, reflecting increased uncertainty and in real - world data.

4. Severe.Storm.Count (Linear Regression, Full 2nd Order)

Model: dt + Y + I(dt²) + I(Y²)
Selection: Lowest PRESS (143.56) among plausible models.
Key results:
- All coefficients significant (p < 0.05).
- Adjusted R² ≈ 0.87, excellent fit.
Graph summary:
- Data fit closely by regression line with prediction bounds enveloping most counts.
- The curve is nonlinear, showing that storm counts increase at an increasing rate over time and temperature.
- Confidence intervals widen with increasing predicted storm counts but most observed points still fall within the bands, confirming good predictive performance..

3. Final selected models

Drought.Count: Year
Wildfire.Count: Year
All.Disasters.Count: delta.temp + Year + I(delta.temp^2) + I(Year^2) + delta.temp:Year
Severe.Storm.Count: delta.temp + Year + I(delta.temp^2) + I(Year^2) + delta.temp:Year

4. Conclusions

Wildfire and Drought counts are best explained by Year alone, indicating a simple linear upward trend over time. Temperature adds no significant independent information once Year is included.
Severe Storm Count and All Disaster Count is best explained by dt + Y + I(dt²) + I(Y²), showing strong nonlinear acceleration. The interaction term dt:Y was not plausible for this response.

This suggests that drought and wildfire counts mainly follow a temporal trend, whereas total disasters and severe storms show more complex nonlinear relationships with both year and temperature anomaly making it a faster and more complex growth patterns.

Assignment 5 - Regression

Tsing Lee & Romika Jani

Regression Analysis Summary: NOAAGISSWD Dataset

Overview

1. Code

Regression Analysis Summary: NOAAGISSWD Dataset

Overview

1. Code

2. Explantions

1. Wildfire.Count (Logistic Regression)

2. Drought.Count (Logistic Regression)

3. All.Disasters.Count (Linear Regression, Full 2nd Order)

4. Severe.Storm.Count (Linear Regression, Full 2nd Order)

3. Final selected models

4. Conclusions