This report simulates and analyzes monthly precipitation data for Czechia over a 10-year period (2010–2019). The analysis uses statistical methods and visualization tools covered in the Statistical Methods course. The aim is to demonstrate how such techniques can be applied to understand trends relevant to water governance and climate adaptation.
Load Required Packages
library(data.table)
library(ggplot2)
library(lubridate)
Simulate Monthly Precipitation Data
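We generate synthetic monthly precipitation values by drawing from a normal distribution with a mean of 50 mm and a standard deviation of 20 mm, rounding to one decimal place, and clipping negative draws to zero.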
set.seed(42)

# Create monthly dates for 10 years
dates <- seq(as.Date("2010-01-01"), as.Date("2019-12-01"), by = "month")

# Generate precipitation data
precip <- rnorm(length(dates), mean = 50, sd = 20)

# Construct data.table
precip_data <- data.table(
  date = dates,
  precip = pmax(round(precip, 1), 0) # avoid negative values
)

# Add year and month columns
precip_data[, year := year(date)]
precip_data[, month := month(date)]

# Display first few rows
head(precip_data)
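The regression that follows is fitted to a table named annual_mean, but the step that builds it does not appear in this report. A minimal sketch of how it could be derived, assuming it holds one row per calendar year with the mean of that year's monthly values (the aggregation itself is an assumption, not the report's confirmed code):

```r
library(data.table)

# Rebuild the simulated series so the sketch runs on its own
set.seed(42)
dates <- seq(as.Date("2010-01-01"), as.Date("2019-12-01"), by = "month")
precip_data <- data.table(
  date = dates,
  precip = pmax(round(rnorm(length(dates), mean = 50, sd = 20), 1), 0)
)
precip_data[, year := data.table::year(date)]

# Assumption: annual_mean is the per-year mean of the monthly values
annual_mean <- precip_data[, .(mean_precip = mean(precip)), by = year]
print(annual_mean)
```

With 10 years of data this gives 10 rows, which matches the 8 residual degrees of freedom reported for the two-parameter linear model.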
lm_model <- lm(mean_precip ~ year, data = annual_mean)
summary(lm_model)
Call:
lm(formula = mean_precip ~ year, data = annual_mean)

Residuals:
     Min       1Q   Median       3Q      Max
-11.7299  -2.2940   0.5146   2.3161  13.3320

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  530.3188  1659.5396   0.320    0.757
year          -0.2381     0.8238  -0.289    0.780

Residual standard error: 7.483 on 8 degrees of freedom
Multiple R-squared:  0.01033, Adjusted R-squared:  -0.1134
F-statistic: 0.08352 on 1 and 8 DF,  p-value: 0.7799
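Read against the output above, the slope of -0.2381 mm per year has a standard error of 0.8238. With 8 residual degrees of freedom, the 95% confidence interval is roughly -2.14 to 1.66 mm per year, which comfortably includes zero. That is the expected outcome for independent draws with a constant mean. A minimal sketch of pulling these quantities out of the fitted model follows. It assumes annual_mean holds one mean per year, and because that step is not shown in the report, it is rebuilt here with base R:

```r
# Rebuild the annual means with base R so the sketch runs on its own
set.seed(42)
dates <- seq(as.Date("2010-01-01"), as.Date("2019-12-01"), by = "month")
precip <- pmax(round(rnorm(length(dates), mean = 50, sd = 20), 1), 0)
yr <- as.integer(format(dates, "%Y"))
annual_mean <- aggregate(precip, by = list(year = yr), FUN = mean)
names(annual_mean)[2] <- "mean_precip"

lm_model <- lm(mean_precip ~ year, data = annual_mean)

# Slope in mm per year, its 95% confidence interval, and its p-value
slope <- coef(lm_model)[["year"]]
slope_ci <- confint(lm_model, "year", level = 0.95)
slope_p <- summary(lm_model)$coefficients["year", "Pr(>|t|)"]

cat(sprintf("slope = %.4f mm/year, 95%% CI [%.2f, %.2f], p = %.3f\n",
            slope, slope_ci[1], slope_ci[2], slope_p))
```

Reporting the interval alongside the p-value makes it clearer how large a trend the 10 annual means could or could not rule out.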
Plot Annual Mean with Trend Line
ggplot(annual_mean, aes(x = year, y = mean_precip)) +
  geom_point(color = "darkblue", size = 3) +
  geom_smooth(method = "lm", se = TRUE, color = "orange") +
  labs(
    title = "Annual Mean Precipitation and Linear Trend",
    x = "Year",
    y = "Mean Precipitation (mm)"
  ) +
  theme_minimal()
Precipitation Distribution
Histogram of Monthly Precipitation
ggplot(precip_data, aes(x = precip)) +
  geom_histogram(binwidth = 5, fill = "skyblue", color = "black") +
  labs(
    title = "Distribution of Monthly Precipitation",
    x = "Precipitation (mm)",
    y = "Frequency"
  ) +
  theme_minimal()
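A numeric summary can complement the histogram. The sketch below is an illustrative addition rather than part of the original analysis. It compares the sample mean and standard deviation with the simulation parameters (50 and 20) and counts how many months were clipped to zero. It regenerates the 120 monthly values (10 years × 12 months) with base R so that it runs on its own.

```r
# Regenerate the 120 simulated monthly values
set.seed(42)
precip <- pmax(round(rnorm(120, mean = 50, sd = 20), 1), 0)

# Sample moments to compare against the simulation parameters (50, 20)
sample_stats <- c(mean = mean(precip), sd = sd(precip))
print(sample_stats)

# Number of months whose value is exactly zero after clipping with pmax()
n_zero <- sum(precip == 0)
print(n_zero)

# Theoretical probability that an unclipped draw is negative: P(Z < -2.5)
p_negative <- pnorm(0, mean = 50, sd = 20)
print(p_negative)
```

The probability of a negative draw is about 0.6%, so fewer than one clipped month is expected in 120 draws. Clipping at zero therefore barely distorts the simulated distribution.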
Boxplot by Year
ggplot(precip_data, aes(x = factor(year), y = precip)) +
  geom_boxplot(fill = "lightgreen") +
  labs(
    title = "Monthly Precipitation Distribution by Year",
    x = "Year",
    y = "Precipitation (mm)"
  ) +
  theme_minimal()
Seasonal Analysis
We categorize the months into seasons and analyze average seasonal precipitation.
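The code for this step does not appear in the report. A minimal sketch of one way to do it follows, assuming meteorological seasons (December to February as Winter, March to May as Spring, June to August as Summer, September to November as Autumn). The season_lookup vector is a hypothetical helper introduced here for illustration.

```r
library(data.table)

# Rebuild the simulated series so the sketch runs on its own
set.seed(42)
dates <- seq(as.Date("2010-01-01"), as.Date("2019-12-01"), by = "month")
precip_data <- data.table(
  date = dates,
  precip = pmax(round(rnorm(length(dates), mean = 50, sd = 20), 1), 0)
)
precip_data[, month := data.table::month(date)]

# Assumption: meteorological seasons, indexed by month number 1..12
season_lookup <- c("Winter", "Winter", "Spring", "Spring", "Spring",
                   "Summer", "Summer", "Summer", "Autumn", "Autumn",
                   "Autumn", "Winter")
precip_data[, season := season_lookup[month]]

# Mean monthly precipitation per season
seasonal_avg <- precip_data[, .(mean_precip = mean(precip)), by = season]
print(seasonal_avg)
```

Grouping with by = season returns the seasons in order of first appearance, which for a series starting in January is Winter, Spring, Summer, Autumn. Because season is a character column, ggplot2 will sort it alphabetically on a plot axis. Converting it to a factor with explicit levels would keep the calendar order.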
   season mean_precip
   <char>       <num>
1: Winter    48.85000
2: Spring    52.08333
3: Summer    50.75667
4: Autumn    51.13000
Bar Plot by Season
ggplot(seasonal_avg, aes(x = season, y = mean_precip, fill = season)) +
  geom_col() +
  labs(
    title = "Average Precipitation by Season",
    x = "Season",
    y = "Mean Precipitation (mm)"
  ) +
  theme_minimal()
Save the Simulated Dataset
fwrite(precip_data, "simulated_precip_data.csv")
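A quick way to confirm that the export is usable is to read the file back and compare it with the table in memory. The sketch below is an illustrative addition. It uses a small stand-in table with made-up values and a temporary file path instead of the report's own data and file name.

```r
library(data.table)

# Hypothetical stand-in with the same column types as precip_data
demo <- data.table(
  date = as.Date(c("2010-01-01", "2010-02-01", "2010-03-01")),
  precip = c(12.5, 0, 48.3)
)

# Temporary path used only for this check, not the report's output file
out_file <- tempfile(fileext = ".csv")
fwrite(demo, out_file)

# Read the file back and compare row count, dates, and values
back <- fread(out_file)
same_rows <- nrow(back) == nrow(demo)
same_dates <- all(as.Date(back$date) == demo$date)
same_values <- isTRUE(all.equal(back$precip, demo$precip))
print(c(rows = same_rows, dates = same_dates, values = same_values))

unlink(out_file)
```

Wrapping the date column in as.Date() before comparing keeps the check valid whether the reader returns dates as text or as a date class.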
Conclusion
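This analysis demonstrates how simulated data and basic statistical tools can be used to assess precipitation patterns relevant to water governance. Because the values are independent draws with a constant mean, the results show no meaningful trend (slope of -0.24 mm per year, p = 0.78) and only small differences between seasonal means (48.9 mm to 52.1 mm). The value of the exercise lies in the workflow: the same aggregation, regression, and visualization steps can be applied to observed records to assess seasonal water availability, which is key for planning under changing climate conditions in Czechia.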