Solar radiation variables of interest

We compare two datasets for solar radiation in California for the year 2010. In future work, we will compare multiple years over the same geographic area, using the years in which the two datasets overlap (1998-2010?). The datasets under consideration come from two sources: NA-CORDEX (run through NCAR) and the National Solar Radiation Data Base (NSRDB), housed at NREL.

Below are 36 plots representing the monthly averages of Global Horizontal Irradiance (GHI), Diffuse Horizontal Irradiance (DHI) and Direct Normal Irradiance (DNI) in California. All three exhibit a similar pattern: values are lower in the winter months and higher in the summer months. We expect this shared pattern because GHI can be calculated from DHI and DNI through the following equality:

\[ GHI = DHI + DNI \cos(\theta) \]

where \(\theta\) is the solar zenith angle: 0 degrees when the sun is vertically above a location and 90 degrees when it is on the horizon.
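As a quick illustration of the equality above, the following is a minimal sketch in R. The function name `ghi_from_components` and the irradiance values are hypothetical and chosen only for illustration; they are not taken from either dataset.

```r
# Hypothetical helper: GHI from DHI, DNI and the solar zenith angle in degrees.
ghi_from_components <- function(dhi, dni, zenith_deg) {
  theta <- zenith_deg * pi / 180   # R's cos() expects radians
  dhi + dni * cos(theta)
}

# Illustrative values in W/m^2 (assumed, not from NSRDB or NA-CORDEX)
ghi_overhead <- ghi_from_components(dhi = 100, dni = 800, zenith_deg = 0)
ghi_low_sun  <- ghi_from_components(dhi = 100, dni = 800, zenith_deg = 60)
ghi_horizon  <- ghi_from_components(dhi = 100, dni = 800, zenith_deg = 90)
```

With the sun overhead the full direct component reaches the horizontal surface, while near the horizon only the diffuse component remains, which is consistent with the lower winter values described here.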





Therefore, as the seasons change and the sun’s position relative to the earth shifts, we expect changes in solar radiation. Within a day, we expect the most solar radiation around midday. The plots below exhibit the seasonal changes mentioned here: June, July and August exhibit peak values for GHI, while the winter months of December, January and February exhibit the lowest values. There is some variation throughout the state as the seasons change, as expected. Northern California shows smaller values of solar radiation than southern California, particularly in the fall and spring (the shoulder seasons). During summer and winter, the variation is less apparent.

Exploratory plots

The following plots are not on the same spatial scale but still exhibit similar variation and solar radiation patterns throughout the year. The GHI variable from the NSRDB dataset (on a 4km x 4km grid) is plotted first followed by the downwelling short-wave solar radiation from the NA-CORDEX WRF ERA-Int driven runs on a 22km x 22km grid.



Visually, the downwelling solar radiation and GHI values are similar to one another across months; however, they are on different spatial scales. We plot the raw difference between the two.

In general, the difference hovers around zero, with more extreme values in some areas of the state. The spring months of March through April tend to be overestimated by the rsds variable, particularly in the mountainous areas of California (the mid-east and northeast sections of CA). Winter months are more similar between the two datasets. Overall, the NA-CORDEX variable tends to underestimate the value observed through NSRDB.


Considering basis functions

Since the two datasets are not currently at the same resolution, we can compare basis functions to understand how the two variables, GHI and downwelling shortwave radiation, vary seasonally over California. Given a data matrix \(X\) of dimension \(m\times n\), we can compute its singular value decomposition (SVD):

\[ X = U\Sigma V^T \]

where \(\Sigma\) is the \(m\times n\) diagonal matrix with the ordered singular values on its diagonal. The matrices \(U\) and \(V\) are orthogonal matrices of dimension \(m\times m\) and \(n\times n\), respectively. (R's svd() returns the reduced form, in which \(U\) is \(m\times n\) and \(\Sigma\) is \(n\times n\), when \(m \ge n\).) The ordered columns of \(U\) describe the variability of the columns of \(X\), while the ordered columns of \(V\) describe the variability of the rows of \(X\). Additionally, the columns of \(V\) will be used as the basis functions, so that a row of \(X\) can be described by:

\[ x(t) = \sum_{k=1}^{K} c_k \phi_k(t) \]

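Here \(\phi_k\) is the \(k\)-th column of \(V\), \(c_k\) is its coefficient for the row in question, and \(K \le n\) is the number of basis functions retained. Each row of our data matrix \(X\) represents a single location within the domain, California, and each column represents a time frame, in this case a month. Therefore, each column of \(U\) is a spatial pattern with one value per location, describing how the monthly maps vary, while each column of \(V\) is a temporal pattern with one value per month, describing how the annual cycle at each location varies.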
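To make the basis-function expansion concrete, here is a minimal sketch on synthetic data. The matrix `X`, the number of locations `n_loc`, the seasonal shape and the choice `K = 2` are all assumptions for illustration and are unrelated to the NSRDB or NA-CORDEX values.

```r
set.seed(1)
n_loc <- 50      # hypothetical number of locations (rows)
month <- 1:12    # columns

# Synthetic data: a shared annual cycle whose amplitude varies by location,
# plus a small amount of noise
amp <- runif(n_loc, 150, 300)
X <- outer(amp, 1 + 0.5 * sin(2 * pi * (month - 3) / 12)) +
  matrix(rnorm(n_loc * 12, sd = 2), n_loc, 12)

s <- svd(X)      # reduced SVD: s$u is 50 x 12, s$v is 12 x 12

K <- 2
coefs <- s$u[, 1:K] %*% diag(s$d[1:K])   # coefficients c_k for every location
X_K <- coefs %*% t(s$v[, 1:K])           # reconstruction from K basis functions

# Relative reconstruction error in the Frobenius norm
rel_err <- norm(X - X_K, "F") / norm(X, "F")
```

Each row of `X_K` is the sum over \(k\) of `coefs[i, k]` times column \(k\) of `s$v`, which is the expansion written above. Because the synthetic data share one dominant seasonal shape, a small `K` already reproduces `X` closely.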

By looking at the respective SVDs of each dataset, we can determine how the variability of the radiation variables is similar or different across space and time. If the resulting plots are similar, then we expect that both datasets capture similar variability and seasonality across space and time, and therefore give similar representations of the variability of solar radiation. The first two plots compare the first four columns of the matrix \(V\) calculated from the SVD of each dataset. They show similar variability and patterns across time in both datasets.

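# NSRDB: one row per location, one column per month
Y <- matrix(nsrdb_order$GHI, byrow = TRUE, ncol = 12)
Y_svd <- svd(Y)
U <- Y_svd$u          # spatial patterns
D <- diag(Y_svd$d)    # singular values on the diagonal
V <- Y_svd$v          # temporal basis functions
B <- V %*% D          # basis functions scaled by their singular values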

# NA-CORDEX: same layout as Y
Y2 <- matrix(nacordex_order$rsds, ncol = 12, byrow = TRUE)
Y_svd2 <- svd(Y2)
U2 <- Y_svd2$u
D2 <- diag(Y_svd2$d)
V2 <- Y_svd2$v
B2 <- V2 %*% D2

# Two panels side by side, with room at the top for a shared title
par(mfrow=c(1,2))
par(oma = c(0, 0, 4, 0))
matplot(1:12, V2[,1:4], type="l", lwd = 2,
        main = "NA-CORDEX")
matplot(1:12, V[,1:4], type="l", lwd = 2,
        main = "NSRDB")
mtext("First 4 columns of V", outer = TRUE, line = 1, cex = 2)

Below are several other plots derived from the SVD to compare the two datasets. The first two plots show the first four columns of the matrix \(B = VD\), that is, the basis functions scaled by their singular values, which indicates how strongly each basis function contributes. The bottom two plots show the squared singular values on a log scale, which indicate the amount of variability described by each successive component. The patterns are similar, which is encouraging: it suggests that the two datasets capture similar seasonal and spatial variability.

#layout(matrix(1:8, 2, 4, byrow = TRUE), respect = TRUE)
par(omi=c(0,0,0,0), mar=c(2, 2, 0, 0), mfrow = c(2,2))

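# Top row: scaled basis functions for NA-CORDEX (left) and NSRDB (right)
matplot(1:12, B2[,1:4], type = "l")
matplot(1:12, B[,1:4], type = "l")

# Bottom row: squared singular values for NSRDB (left) and NA-CORDEX (right)
plot(diag(D)^2, log = "y")
plot(diag(D2)^2, log = "y")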
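The squared singular values can also be turned into a cumulative share of variability for the first \(n\) components. The sketch below uses made-up singular values in a hypothetical vector `d`; in the analysis above the corresponding vectors would be `Y_svd$d` and `Y_svd2$d`. Note that the squared singular values measure the share of the total sum of squares, which equals a share of variance only if the columns have been centred.

```r
# Hypothetical singular values, for illustration only
d <- c(100, 20, 5, 1)

# Share of the total sum of squares carried by each component
var_explained <- d^2 / sum(d^2)

# Cumulative share described by the first n components
cum_explained <- cumsum(var_explained)
```

Comparing `cum_explained` between the two datasets gives a numerical counterpart to the visual comparison of the bottom two plots.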

Linear model relating rsds and GHI

We can fit a linear model regressing the NSRDB values on the NA-CORDEX data to determine whether there is any sort of linear relationship between the two.

mod <- lm( c(nsrdb_order$GHI) ~ c(nacordex_order$rsds))
summary(mod)
## 
## Call:
## lm(formula = c(nsrdb_order$GHI) ~ c(nacordex_order$rsds))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -131.686   -8.524    2.975   11.907   55.144 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            -21.833585   0.503292  -43.38   <2e-16 ***
## c(nacordex_order$rsds)   0.952841   0.001848  515.50   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 16.18 on 7702 degrees of freedom
## Multiple R-squared:  0.9718, Adjusted R-squared:  0.9718 
## F-statistic: 2.657e+05 on 1 and 7702 DF,  p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(mod)

Using the results of the linear model, we can get an estimate of the NSRDB values from the NA-CORDEX variable:

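# Design matrix with an intercept column; multiplying by the coefficients
# gives the predicted GHI for every NA-CORDEX value
x <- cbind(rep(1, length(nacordex_order$rsds)), nacordex_order$rsds)
nacordex_order$ghi_pred <- c(mod$coefficients %*% t(x))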

# Plot the difference between NSRDB GHI and the GHI predicted from rsds
diff <- matrix(NA, ncol = 12, nrow = 642)  # 642 locations x 12 months
par(mfrow=c(3,4),
    mar = c(1,1,1,1),
    mai=c(0.35,0.35,0.35,0.6),
    oma = c(0, 0, 2, 0))
for (i in 1:12) {
  # NSRDB indexes months by number, NA-CORDEX by abbreviation
  diff[,i] <- nsrdb_order[nsrdb_order$Month==i,]$GHI -
    nacordex_order[nacordex_order$month==month.abb[i],]$ghi_pred
  quilt.plot(x=lon_ord, y=lat_ord,
             z=diff[,i],
             main = month.abb[i],
             # common colour scale: min and max residuals from summary(mod)
             zlim=c(-131.68559, 55.14376))
}
# Add the shared title once, after all panels are drawn
mtext("Difference between GHI and predicted", outer = TRUE, cex = 1.5)

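par(mfrow=c(3,4),
    mar = c(1,1,1.8,1),
    mai=c(0.35,0.35,0.35,0.35),
    oma = c(0, 0, 2, 0))
for (i in 1:12) {
  # Distribution across locations of observed minus predicted GHI for month i
  hist(diff[,i], main = month.abb[i],
       xlab = expression(paste(theta, "-", hat(theta))))
}
# Add the shared title once, after all panels are drawn
mtext("Difference histograms", outer = TRUE, cex = 1.5)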
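The monthly histograms could also be summarised numerically, for example by a mean difference (bias) and a root mean squared difference per month. The following is a sketch only: `d_mat` is a synthetic stand-in for the 642 x 12 matrix of differences, filled with random numbers, so the resulting values say nothing about the actual datasets.

```r
set.seed(42)
# Synthetic stand-in for the matrix of differences (locations x months)
d_mat <- matrix(rnorm(642 * 12, mean = 0, sd = 16), nrow = 642, ncol = 12)
colnames(d_mat) <- month.abb

monthly_bias <- colMeans(d_mat)           # mean difference per month
monthly_rmse <- sqrt(colMeans(d_mat^2))   # root mean squared difference

monthly_summary <- data.frame(month = month.abb,
                              bias = monthly_bias,
                              rmse = monthly_rmse)
```

A bias far from zero in a given month would indicate systematic over- or underprediction, while a large RMSE with a small bias would indicate scatter around the observed values.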

Sacramento, California

We could also consider a single location within California, keeping in mind that the comparison will be between the 4km x 4km grid cell and the 22km x 22km grid cell at that location.

sac_all_years <- ca_df_rsds_nacordex_monthly[ca_df_rsds_nacordex_monthly$lon == sac_loc[1],]
sac_mat <- matrix(sac_all_years$rsds, ncol= 12, byrow=TRUE)
matplot(t(sac_mat), type = "l", main = "RSDS in Sacramento: 1998-2010",
        ylab= "rsds", xlab = "Month", col = viridis::inferno(13),
        lwd = 2)

Across 1998-2010, the solar radiation in Sacramento appears fairly similar from year to year. We might not see drastic changes over this period, though, since it covers only 13 years of data at a single location.
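When selecting a single location, one option is to pick the grid cell nearest to the point of interest using both longitude and latitude, since filtering on longitude alone relies on that longitude being unique to one cell. The sketch below assumes a hypothetical regular grid `grid` and approximate coordinates for Sacramento; the real grid would come from the dataset itself.

```r
# Hypothetical grid of cell centres (assumed for illustration)
grid <- expand.grid(lon = seq(-124, -114, by = 0.25),
                    lat = seq(32, 42, by = 0.25))

# Approximate coordinates of Sacramento (longitude, latitude)
sac <- c(lon = -121.49, lat = 38.58)

# Squared distance in degrees; adequate for choosing a neighbour on a fine grid
d2 <- (grid$lon - sac["lon"])^2 + (grid$lat - sac["lat"])^2
nearest <- grid[which.min(d2), ]
```

The row `nearest` then holds the longitude and latitude to match against when subsetting the monthly data frame.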