#Does cloud seeding increase rainfall?

#Read and examine file

The file is read in and its basic structure is reviewed.

clouds <- read.csv("clouds.csv")
str(clouds)
## 'data.frame':    24 obs. of  8 variables:
##  $ X         : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ seeding   : chr  "no" "yes" "yes" "no" ...
##  $ time      : int  0 1 3 4 6 9 18 25 27 28 ...
##  $ sne       : num  1.75 2.7 4.1 2.35 4.25 1.6 1.3 3.35 2.85 2.2 ...
##  $ cloudcover: num  13.4 37.9 3.9 5.3 7.1 6.9 4.6 4.9 12.1 5.2 ...
##  $ prewetness: num  0.274 1.267 0.198 0.526 0.25 ...
##  $ echomotion: chr  "stationary" "moving" "stationary" "moving" ...
##  $ rainfall  : num  12.85 5.52 6.29 6.11 2.45 ...

#Statistical characteristics

Some basic functions are used to view statistical characteristics of the data. Because the question is whether seeding increases rainfall from a cloud, the statistics focus on rainfall and on comparing seeded with unseeded observations.

data.frame(clouds)
##     X seeding time  sne cloudcover prewetness echomotion rainfall
## 1   1      no    0 1.75       13.4      0.274 stationary    12.85
## 2   2     yes    1 2.70       37.9      1.267     moving     5.52
## 3   3     yes    3 4.10        3.9      0.198 stationary     6.29
## 4   4      no    4 2.35        5.3      0.526     moving     6.11
## 5   5     yes    6 4.25        7.1      0.250     moving     2.45
## 6   6      no    9 1.60        6.9      0.018 stationary     3.61
## 7   7      no   18 1.30        4.6      0.307     moving     0.47
## 8   8      no   25 3.35        4.9      0.194     moving     4.56
## 9   9      no   27 2.85       12.1      0.751     moving     6.35
## 10 10     yes   28 2.20        5.2      0.084     moving     5.06
## 11 11     yes   29 4.40        4.1      0.236     moving     2.76
## 12 12     yes   32 3.10        2.8      0.214     moving     4.05
## 13 13      no   33 3.95        6.8      0.796     moving     5.74
## 14 14     yes   35 2.90        3.0      0.124     moving     4.84
## 15 15     yes   38 2.05        7.0      0.144     moving    11.86
## 16 16      no   39 4.00       11.3      0.398     moving     4.45
## 17 17      no   53 3.35        4.2      0.237 stationary     3.66
## 18 18     yes   55 3.70        3.3      0.960     moving     4.22
## 19 19      no   56 3.80        2.2      0.230     moving     1.16
## 20 20     yes   59 3.40        6.5      0.142 stationary     5.45
## 21 21     yes   65 3.15        3.1      0.073     moving     2.02
## 22 22      no   68 3.15        2.6      0.136     moving     0.82
## 23 23     yes   82 4.01        8.3      0.123     moving     1.09
## 24 24      no   83 4.65        7.4      0.168     moving     0.28
mean(clouds$rainfall)
## [1] 4.402917
sd(clouds$rainfall)
## [1] 3.109137
mean(clouds$rainfall[clouds$seeding == "yes"])
## [1] 4.634167
mean(clouds$rainfall[clouds$seeding == "no"])
## [1] 4.171667
sd(clouds$rainfall[clouds$seeding == "yes"])
## [1] 2.776841
sd(clouds$rainfall[clouds$seeding == "no"])
## [1] 3.519196
seed <- clouds$seeding == "yes"
no_seed <- clouds$seeding == "no"
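
The separate `mean()` and `sd()` calls above can also be collapsed into one grouped summary. The following is a minimal sketch of one way to do that in base R. So that it runs on its own, it rebuilds only the `seeding` and `rainfall` columns from the table printed above; the names `clouds_small` and `group_stats` are assumptions made for this example and are not part of the original analysis.

```r
# Rebuild the two columns needed here from the printed table
# (clouds_small is a hypothetical name used only for this sketch).
clouds_small <- data.frame(
  seeding = c("no", "yes", "yes", "no", "yes", "no", "no", "no",
              "no", "yes", "yes", "yes", "no", "yes", "yes", "no",
              "no", "yes", "no", "yes", "yes", "no", "yes", "no"),
  rainfall = c(12.85, 5.52, 6.29, 6.11, 2.45, 3.61, 0.47, 4.56,
               6.35, 5.06, 2.76, 4.05, 5.74, 4.84, 11.86, 4.45,
               3.66, 4.22, 1.16, 5.45, 2.02, 0.82, 1.09, 0.28)
)

# Split rainfall by seeding group, then compute count, mean and
# standard deviation for each group in a single pass.
group_stats <- sapply(
  split(clouds_small$rainfall, clouds_small$seeding),
  function(x) c(n = length(x), mean = mean(x), sd = sd(x))
)

# Rows are the statistics, columns are the groups "no" and "yes".
print(group_stats)
```

With the real data frame loaded, the same call works with `clouds$rainfall` and `clouds$seeding` in place of the rebuilt columns.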

#Visualizations of the table

Histograms were created to give a visual representation of the data.

hist(clouds$rainfall[seed], xlab = "Rainfall", main = "Rainfall with Seeding vs without", col = 4)
hist(clouds$rainfall[no_seed], col = 7, add = TRUE)

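This suggests somewhat greater rainfall with seeding, because the seeded histogram peaks in the middle of the range. Rainfall without seeding is spread more evenly across the bins, although its standard deviation is actually the larger of the two (3.52 versus 2.78), so it is not more consistent in the amount of rainfall.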
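
When two histograms are overlaid with `add = TRUE`, each `hist()` call picks its own bins and the second one is drawn on the first one's axes, which can make the comparison harder to read. The sketch below shows one possible alternative: compute both histograms on shared breaks and draw them with semi-transparent fills and a legend. The break points (0 to 14 in steps of 2), the colours, and the name `clouds_small` are assumptions chosen for this example; the data are rebuilt from the printed table so the block runs on its own.

```r
# Rebuild the two columns needed here from the printed table
# (clouds_small is a hypothetical name used only for this sketch).
clouds_small <- data.frame(
  seeding = c("no", "yes", "yes", "no", "yes", "no", "no", "no",
              "no", "yes", "yes", "yes", "no", "yes", "yes", "no",
              "no", "yes", "no", "yes", "yes", "no", "yes", "no"),
  rainfall = c(12.85, 5.52, 6.29, 6.11, 2.45, 3.61, 0.47, 4.56,
               6.35, 5.06, 2.76, 4.05, 5.74, 4.84, 11.86, 4.45,
               3.66, 4.22, 1.16, 5.45, 2.02, 0.82, 1.09, 0.28)
)
seeded   <- clouds_small$rainfall[clouds_small$seeding == "yes"]
unseeded <- clouds_small$rainfall[clouds_small$seeding == "no"]

# Shared bin edges (an assumption: width 2, wide enough for all values).
breaks <- seq(0, 14, by = 2)
h_seeded   <- hist(seeded, breaks = breaks, plot = FALSE)
h_unseeded <- hist(unseeded, breaks = breaks, plot = FALSE)

# Common y-axis limit so neither histogram is clipped.
ymax <- max(h_seeded$counts, h_unseeded$counts)

# Semi-transparent fills keep both sets of bars visible where they overlap.
col_seeded   <- rgb(0, 0, 1, 0.5)
col_unseeded <- rgb(1, 1, 0, 0.5)
plot(h_seeded, col = col_seeded, ylim = c(0, ymax),
     xlab = "Rainfall", main = "Rainfall with seeding vs without")
plot(h_unseeded, col = col_unseeded, add = TRUE)
legend("topright", legend = c("Seeded", "Unseeded"),
       fill = c(col_seeded, col_unseeded))
```

A side-by-side `boxplot(rainfall ~ seeding, data = clouds_small)` would be another reasonable choice for only 12 observations per group.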

#Run a t-test

This test is done to see if there is any significant difference between rainfall with seeding and rainfall without it.

t.test(clouds$rainfall, seed, alternative="two.sided")
## 
##  Welch Two Sample t-test
## 
## data:  clouds$rainfall and seed
## t = 6.0684, df = 24.24, p-value = 2.763e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.576204 5.229630
## sample estimates:
## mean of x mean of y 
##  4.402917  0.500000
t.test(clouds$rainfall, no_seed, alternative="two.sided")
## 
##  Welch Two Sample t-test
## 
## data:  clouds$rainfall and no_seed
## t = 6.0684, df = 24.24, p-value = 2.763e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.576204 5.229630
## sample estimates:
## mean of x mean of y 
##  4.402917  0.500000

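Note that both calls compare rainfall against the logical indicator vector (whose mean is 0.5), not seeded rainfall against unseeded rainfall, which is why the two outputs are identical and the p-value is so small. The group means reported earlier (4.63 with seeding, 4.17 without) are close relative to their standard deviations (2.78 and 3.52), so there appears to be no significant difference between the groups.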
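
A direct comparison of the two groups can be written with the formula interface of `t.test()`, which splits the response by a two-level grouping variable. The following is a minimal sketch, not the analysis as originally run: it rebuilds the `seeding` and `rainfall` columns from the printed table so that it is self-contained, and the names `clouds_small` and `welch` are assumptions made for this example.

```r
# Rebuild the two columns needed here from the printed table
# (clouds_small is a hypothetical name used only for this sketch).
clouds_small <- data.frame(
  seeding = c("no", "yes", "yes", "no", "yes", "no", "no", "no",
              "no", "yes", "yes", "yes", "no", "yes", "yes", "no",
              "no", "yes", "no", "yes", "yes", "no", "yes", "no"),
  rainfall = c(12.85, 5.52, 6.29, 6.11, 2.45, 3.61, 0.47, 4.56,
               6.35, 5.06, 2.76, 4.05, 5.74, 4.84, 11.86, 4.45,
               3.66, 4.22, 1.16, 5.45, 2.02, 0.82, 1.09, 0.28)
)

# Welch two-sample t-test of rainfall between the "no" and "yes" groups.
# var.equal defaults to FALSE, so equal variances are not assumed.
welch <- t.test(rainfall ~ seeding, data = clouds_small,
                alternative = "two.sided")
print(welch)

# The estimates are the two group means, in the order "no", "yes".
welch$estimate
```

With the real data frame loaded, `t.test(rainfall ~ seeding, data = clouds)` is the equivalent call.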

#Multiple Linear Regression Model

A multiple linear regression model is created that includes several columns from the table, in order to see which of them have an impact on rainfall.

cloud2 <- lm(clouds$rainfall ~ seed + clouds$cloudcover + clouds$prewetness + clouds$echomotion + clouds$sne)
cloud2
## 
## Call:
## lm(formula = clouds$rainfall ~ seed + clouds$cloudcover + clouds$prewetness + 
##     clouds$echomotion + clouds$sne)
## 
## Coefficients:
##                 (Intercept)                     seedTRUE  
##                     6.37680                      1.12011  
##           clouds$cloudcover            clouds$prewetness  
##                     0.01821                      2.55109  
## clouds$echomotionstationary                   clouds$sne  
##                     2.59855                     -1.27530
summary(cloud2)
## 
## Call:
## lm(formula = clouds$rainfall ~ seed + clouds$cloudcover + clouds$prewetness + 
##     clouds$echomotion + clouds$sne)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.1158 -1.7078 -0.2422  1.3368  6.4827 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                  6.37680    2.43432   2.620   0.0174 *
## seedTRUE                     1.12011    1.20725   0.928   0.3658  
## clouds$cloudcover            0.01821    0.11508   0.158   0.8761  
## clouds$prewetness            2.55109    2.70090   0.945   0.3574  
## clouds$echomotionstationary  2.59855    1.54090   1.686   0.1090  
## clouds$sne                  -1.27530    0.68015  -1.875   0.0771 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.855 on 18 degrees of freedom
## Multiple R-squared:  0.3403, Adjusted R-squared:  0.157 
## F-statistic: 1.857 on 5 and 18 DF,  p-value: 0.1524
anova(cloud2)
## Analysis of Variance Table
## 
## Response: clouds$rainfall
##                   Df  Sum Sq Mean Sq F value  Pr(>F)  
## seed               1   1.283  1.2834  0.1575 0.69613  
## clouds$cloudcover  1  15.738 15.7377  1.9313 0.18157  
## clouds$prewetness  1   0.003  0.0027  0.0003 0.98557  
## clouds$echomotion  1  29.985 29.9853  3.6798 0.07108 .
## clouds$sne         1  28.649 28.6491  3.5158 0.07711 .
## Residuals         18 146.677  8.1487                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

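Based on the numbers, none of the predictors is significant at the 0.05 level, and the overall F-test is not significant either (p = 0.1524). sne comes closest (p = 0.077), so it appears to have the strongest relationship with rainfall; its coefficient is negative, meaning rainfall tends to drop as sne rises. Cloud cover shows little evidence of an effect once the other variables are included (p = 0.876). Echomotion also appears to matter (p = 0.109 in the coefficient table, 0.071 in the ANOVA table); however, that is (to my understanding) used more for recording and measuring than as an influence on the rainfall.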
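
The p-values in the `summary()` and `anova()` tables differ because `anova()` tests terms sequentially, in the order they appear in the formula, while the coefficient table tests each term given all the others. A sketch of one way to make this explicit is shown below: the model is written with the `data =` argument so the terms get plain column names, and `drop1()` reports an F-test for removing each term from the full model. The data are rebuilt from the printed table so the block runs on its own; `clouds_tbl`, `fit` and `term_tests` are hypothetical names used only here.

```r
# Rebuild the modelling columns from the printed table
# (clouds_tbl is a hypothetical name used only for this sketch).
clouds_tbl <- data.frame(
  seeding = c("no", "yes", "yes", "no", "yes", "no", "no", "no",
              "no", "yes", "yes", "yes", "no", "yes", "yes", "no",
              "no", "yes", "no", "yes", "yes", "no", "yes", "no"),
  sne = c(1.75, 2.70, 4.10, 2.35, 4.25, 1.60, 1.30, 3.35,
          2.85, 2.20, 4.40, 3.10, 3.95, 2.90, 2.05, 4.00,
          3.35, 3.70, 3.80, 3.40, 3.15, 3.15, 4.01, 4.65),
  cloudcover = c(13.4, 37.9, 3.9, 5.3, 7.1, 6.9, 4.6, 4.9,
                 12.1, 5.2, 4.1, 2.8, 6.8, 3.0, 7.0, 11.3,
                 4.2, 3.3, 2.2, 6.5, 3.1, 2.6, 8.3, 7.4),
  prewetness = c(0.274, 1.267, 0.198, 0.526, 0.250, 0.018, 0.307, 0.194,
                 0.751, 0.084, 0.236, 0.214, 0.796, 0.124, 0.144, 0.398,
                 0.237, 0.960, 0.230, 0.142, 0.073, 0.136, 0.123, 0.168),
  echomotion = c("stationary", "moving", "stationary", "moving",
                 "moving", "stationary", "moving", "moving",
                 "moving", "moving", "moving", "moving",
                 "moving", "moving", "moving", "moving",
                 "stationary", "moving", "moving", "stationary",
                 "moving", "moving", "moving", "moving"),
  rainfall = c(12.85, 5.52, 6.29, 6.11, 2.45, 3.61, 0.47, 4.56,
               6.35, 5.06, 2.76, 4.05, 5.74, 4.84, 11.86, 4.45,
               3.66, 4.22, 1.16, 5.45, 2.02, 0.82, 1.09, 0.28)
)

# Same predictors as the model above, referenced through data = so the
# coefficient names are plain column names (e.g. "seedingyes").
fit <- lm(rainfall ~ seeding + cloudcover + prewetness + echomotion + sne,
          data = clouds_tbl)
summary(fit)

# F-test for dropping each term from the full model; unlike anova(fit),
# these tests do not depend on the order of the terms in the formula.
term_tests <- drop1(fit, test = "F")
print(term_tests)
```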

#SNE with Rainfall

lm(clouds$sne~no_seed)
## 
## Call:
## lm(formula = clouds$sne ~ no_seed)
## 
## Coefficients:
## (Intercept)  no_seedTRUE  
##      3.3300      -0.3217
lm(clouds$sne~seed)
## 
## Call:
## lm(formula = clouds$sne ~ seed)
## 
## Coefficients:
## (Intercept)     seedTRUE  
##      3.0083       0.3217
hist(clouds$sne[seed], col = 'purple')
hist(clouds$sne[no_seed], col = 'green', add = TRUE)

summary(lm(clouds$sne~no_seed))
## 
## Call:
## lm(formula = clouds$sne ~ no_seed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7083 -0.6371  0.1058  0.7754  1.6417 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.3300     0.2694  12.361 2.25e-11 ***
## no_seedTRUE  -0.3217     0.3810  -0.844    0.408    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9332 on 22 degrees of freedom
## Multiple R-squared:  0.03138,    Adjusted R-squared:  -0.01264 
## F-statistic: 0.7128 on 1 and 22 DF,  p-value: 0.4076
summary(lm(clouds$sne~seed))
## 
## Call:
## lm(formula = clouds$sne ~ seed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7083 -0.6371  0.1058  0.7754  1.6417 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.0083     0.2694  11.167 1.56e-10 ***
## seedTRUE      0.3217     0.3810   0.844    0.408    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9332 on 22 degrees of freedom
## Multiple R-squared:  0.03138,    Adjusted R-squared:  -0.01264 
## F-statistic: 0.7128 on 1 and 22 DF,  p-value: 0.4076

According to these models, mean SNE is slightly higher with seeding (3.33) than without (3.01), but the difference of 0.32 is not significant (p = 0.408). The models and histograms in this section describe how SNE is distributed in each group; they do not include rainfall, so on their own they cannot show that higher SNE leads to more rainfall. In the multiple regression above, the SNE coefficient is negative (-1.28, p = 0.077), which suggests that rainfall tends to drop as SNE increases. There is a dip in the seeded histogram as well, however it rises again right afterwards, so the two groups do not cover the SNE range in the same way and SNE should be kept in mind when comparing their rainfall.
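
To look at SNE and rainfall together, rainfall can be plotted against SNE with one colour per seeding group, and a model with an interaction term can give each group its own SNE slope. The block below is a rough sketch of that idea, offered as one possible follow-up rather than as part of the original analysis. It rebuilds the three columns it needs from the printed table, and the names `clouds_small`, `is_seeded` and `fit_sne` are assumptions made for this example.

```r
# Rebuild the three columns needed here from the printed table
# (clouds_small is a hypothetical name used only for this sketch).
clouds_small <- data.frame(
  seeding = c("no", "yes", "yes", "no", "yes", "no", "no", "no",
              "no", "yes", "yes", "yes", "no", "yes", "yes", "no",
              "no", "yes", "no", "yes", "yes", "no", "yes", "no"),
  sne = c(1.75, 2.70, 4.10, 2.35, 4.25, 1.60, 1.30, 3.35,
          2.85, 2.20, 4.40, 3.10, 3.95, 2.90, 2.05, 4.00,
          3.35, 3.70, 3.80, 3.40, 3.15, 3.15, 4.01, 4.65),
  rainfall = c(12.85, 5.52, 6.29, 6.11, 2.45, 3.61, 0.47, 4.56,
               6.35, 5.06, 2.76, 4.05, 5.74, 4.84, 11.86, 4.45,
               3.66, 4.22, 1.16, 5.45, 2.02, 0.82, 1.09, 0.28)
)

# Scatter plot of rainfall against SNE, coloured by seeding group.
is_seeded <- clouds_small$seeding == "yes"
plot(clouds_small$sne, clouds_small$rainfall,
     col = ifelse(is_seeded, "purple", "green"), pch = 19,
     xlab = "SNE", ylab = "Rainfall")
legend("topright", legend = c("Seeded", "Unseeded"),
       col = c("purple", "green"), pch = 19)

# sne * seeding expands to sne + seeding + sne:seeding, so each group
# gets its own intercept and its own SNE slope.
fit_sne <- lm(rainfall ~ sne * seeding, data = clouds_small)
summary(fit_sne)
```

The `sne:seedingyes` coefficient in the summary is the difference between the seeded and unseeded slopes, so it is the term to read when asking whether SNE relates to rainfall differently in the two groups.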

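#Results

It appears that seeding may lead to more rainfall: in this data, mean rainfall is slightly higher with seeding (4.63) than without (4.17), and the regression coefficient for seeding is positive (1.12). However, the seeding term is not statistically significant in the regression (p = 0.366) or in the ANOVA table (p = 0.696), so with only 24 observations the increase is suggested by the data but not demonstrated.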