R user meeting Sep. 2022

An-Chi Ho and Eva Rifà || Contributor: Chihchung Chou

Agenda

  1. Ice-breaker: RStudio::conf
  2. News
    • General R
    • s2dv
    • ClimProjDiags
    • startR
    • CSTools
  3. User presentation: CSIndicators [Chihchung Chou]
  4. Q&A

Ice-breaker

RStudio::conf

  • Talk recordings and workshop materials from rstudio::conf(2022)

  • Some talks are also on YouTube

  • Some announcements:

    • RStudio is changing its name to Posit
    • Use Quarto for creating content with Python, R, Julia, and Observable
    • New developments in Shiny: Shiny for Python, Shiny without a server, a visual Shiny UI editor, and more
    • Updates from the tidymodels and vetiver teams
    • Other topics

RStudio::conf (cont.)

Some interesting talks…

General R

Individual R user meeting

  • Let’s start! Choose your time slot here
  • Meet with An-Chi and Eva to discuss your R scripts and any R-related topics
  • What you need to do:
    • Answer a short questionnaire about the usage of R tools
    • (Preferably) Prepare one or a few R scripts and walk us through them; the goal is to spot any problems or room for improvement, either in your scripts or in the tools
    • Besides the scripts, any questions or suggestions about the tools are welcome

Individual R user meeting (cont.)

  • The meeting is voluntary, but we encourage you to sign up even if you only use the tools a little. It gives us a chance to understand your needs and develop suitable tools. If you’re an experienced user, we’d like to hear your opinions so we can make the tools better.

s2dv::Load or startR::Start?

  • These are the two main tools in our department for loading netCDF data in R.
  • Development of Load() has low priority as long as Start() can achieve the same goal.
  • Load() is still used by CST_Load(), but you can load data with Start() and then convert the result to class s2dv_cube with CSTools::as.s2dv_cube().
  • Note that as.s2dv_cube() does not yet convert startR objects perfectly.

s2dv::Load or startR::Start? (cont.)

→ Talk to me if you have a strong reason to use Load(), so I can prioritize its development.

Tip: To read files individually, the easyNCDF or ncdf4 package is easier to use.

s2dv

PlotEquiMap() bugfixes

Fixed the case where longitude is not continuous (e.g., c(0:10, 350:360))

Status: Fixed and merged in master

source('https://earth.bsc.es/gitlab/es/s2dv/-/raw/master/R/PlotEquiMap.R')

DiffCorr() significance test

The significance test method in DiffCorr() has changed. There are three options: “two-sided”, “one-sided-higher”, and “one-sided-lower”.

    if (type == 'two-sided') {
      output$sign <- ifelse(!is.na(p.value) & p.value <= alpha/2, TRUE, FALSE)
    } else if (type == 'one-sided-higher') {
      output$sign <- ifelse(!is.na(p.value) & p.value <= alpha & output$diff.corr > 0, TRUE, FALSE)
    } else if (type == 'one-sided-lower') {
      output$sign <- ifelse(!is.na(p.value) & p.value <= alpha & output$diff.corr < 0, TRUE, FALSE)
    }

Status: in branch develop-DiffCorr-twosided
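
The branching above can be condensed into a small stand-alone helper. This is only a sketch: is_significant() and its default alpha = 0.05 are assumptions made for illustration and are not part of s2dv; the decision rules themselves are copied from the snippet above.

```r
# Hypothetical helper (not an s2dv function): flag significant differences
# in correlation for the three test types listed above.
is_significant <- function(p.value, diff.corr, alpha = 0.05,
                           type = c("two-sided", "one-sided-higher",
                                    "one-sided-lower")) {
  type <- match.arg(type)
  ok <- !is.na(p.value)  # a missing p-value is never flagged as significant
  switch(type,
    "two-sided"        = ok & p.value <= alpha / 2,
    "one-sided-higher" = ok & p.value <= alpha & diff.corr > 0,
    "one-sided-lower"  = ok & p.value <= alpha & diff.corr < 0
  )
}

# Made-up p-values and correlation differences for three grid points
p <- c(0.01, 0.04, NA)
d <- c(0.2, -0.3, 0.1)
sign_two    <- is_significant(p, d, type = "two-sided")
sign_higher <- is_significant(p, d, type = "one-sided-higher")
sign_lower  <- is_significant(p, d, type = "one-sided-lower")
```

The one-sided options additionally require the sign of the correlation difference to match the tested direction, so the same p-value can be significant for one option and not for another.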

New functions

  • CRPS: Compute the Continuous Ranked Probability Score
  • CRPSS: Compute the Continuous Ranked Probability Skill Score

Status: in master
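
To make the two scores concrete, here is a minimal base-R sketch of the standard ensemble (kernel) estimator of the CRPS and of a skill score relative to a reference forecast. It is not the s2dv implementation: crps_ens(), crpss_sketch() and the toy numbers are assumptions for illustration, and options such as a fair correction or the dimension handling of the real functions are left out.

```r
# CRPS of one ensemble forecast against one observation:
#   mean|x_i - y| - 0.5 * mean|x_i - x_j|
crps_ens <- function(ens, obs) {
  mean(abs(ens - obs)) - 0.5 * mean(abs(outer(ens, ens, "-")))
}

# Skill score against a reference: 1 is perfect, 0 matches the reference,
# negative values are worse than the reference.
crpss_sketch <- function(crps_fcst, crps_ref) {
  1 - crps_fcst / crps_ref
}

# Toy data (made up): 3 start dates x 3 members
fcst <- rbind(c(1, 2, 3), c(0, 1, 2), c(3, 4, 5))
ref  <- c(0, 2, 4)   # the same reference ensemble is used for every start date
obs  <- c(2, 1, 4)

# Average the per-start-date scores, then compare forecast and reference
crps_fcst <- mean(sapply(seq_along(obs), function(t) crps_ens(fcst[t, ], obs[t])))
crps_ref  <- mean(sapply(seq_along(obs), function(t) crps_ens(ref, obs[t])))
skill <- crpss_sketch(crps_fcst, crps_ref)
```

For a single-member ensemble the spread term vanishes and the CRPS reduces to the absolute error.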

  • Bias: Compute the Mean Bias
  • BiasSS: Compute the Mean Bias Skill Score

Status: in branch develop-Bias

dat_dim is added in several functions

  • Functions that already had the dat_dim parameter (default dat_dim = 'dataset'), which names the dataset dimension, and now allow it to be NULL: Ano_CrossValid(), Corr(), RMS(), RMSS() and UltimateBrier().
RMS <- function(exp, obs, time_dim = 'sdate', dat_dim = 'dataset',
                comp_dim = NULL, limits = NULL, conf = TRUE, conf.lev = 0.95,
                ncores = NULL) { … }
  • Functions to which dat_dim has been added (default dat_dim = NULL): RPS() and RPSS().
RPS <- function(exp, obs, time_dim = 'sdate', memb_dim = 'member', dat_dim = NULL,
                prob_thresholds = c(1/3, 2/3), indices_for_clim = NULL, Fair = FALSE,
                weights = NULL, ncores = NULL) { … }

dat_dim is added in several functions (cont.)

  • The new functions CRPS() and CRPSS() also have the dat_dim parameter (default dat_dim = NULL).

Status: In master
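
As a rough illustration of what dat_dim = NULL means for the caller, the base-R sketch below computes an RMSE along the time dimension with and without a dataset dimension. rms_sketch() is a hypothetical stand-in, not s2dv::RMS(); the way it adds a temporary length-1 dataset dimension and the shape of its output are assumptions made for this example.

```r
# Hypothetical helper, not s2dv::RMS(): RMSE along time_dim for arrays with
# named dimensions, either [time_dim] or [dat_dim, time_dim].
rms_sketch <- function(exp, obs, time_dim = "sdate", dat_dim = NULL) {
  no_dat <- is.null(dat_dim)
  if (no_dat) {
    # No dataset dimension: add a temporary one of length 1 so that a single
    # code path serves both cases.
    dat_dim <- "dataset"
    exp <- array(exp, dim = c(dataset = 1, dim(exp)))
    obs <- array(obs, dim = c(dataset = 1, dim(obs)))
  }
  # This sketch only supports the dimension order [dat_dim, time_dim].
  stopifnot(identical(names(dim(exp)), c(dat_dim, time_dim)),
            identical(names(dim(obs)), c(dat_dim, time_dim)))
  n_exp <- dim(exp)[[dat_dim]]
  n_obs <- dim(obs)[[dat_dim]]
  res <- array(NA_real_, dim = c(nexp = n_exp, nobs = n_obs))
  for (i in seq_len(n_exp)) {
    for (j in seq_len(n_obs)) {
      res[i, j] <- sqrt(mean((exp[i, ] - obs[j, ])^2))
    }
  }
  # Drop the temporary dimension again when the caller passed dat_dim = NULL.
  if (no_dat) as.vector(res) else res
}

# Without a dataset dimension: the result is a single value
exp1 <- array(c(1, 2, 3, 4), dim = c(sdate = 4))
obs1 <- array(c(1, 2, 3, 6), dim = c(sdate = 4))
r1 <- rms_sketch(exp1, obs1)

# With a dataset dimension: the result has one entry per exp/obs pair
exp2 <- array(c(1, 1, 2, 2, 3, 3, 4, 4), dim = c(dataset = 2, sdate = 4))
obs2 <- array(c(1, 2, 3, 6), dim = c(dataset = 1, sdate = 4))
r2 <- rms_sketch(exp2, obs2, dat_dim = "dataset")
```

With dat_dim = NULL the input needs no length-1 dataset dimension and the output carries no nexp/nobs dimensions.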

ClimProjDiags

New function ShiftLon()

Shift global longitudes of a data array. Useful for map plotting or aligning datasets.

ShiftLon <- function(data, lon, westB, lon_dim = 'lon', ncores = NULL) { … }

Status: In master
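
The idea behind a global longitude shift can be sketched in a few lines of base R. shift_lon_sketch() below is a hypothetical helper for a [lat, lon] matrix, not the ClimProjDiags implementation; treating westB as the western edge of a 360-degree window is an assumption taken from the parameter name.

```r
# Hypothetical helper (not ClimProjDiags::ShiftLon): reorder the longitude
# axis of a [lat, lon] matrix so that it starts at westB.
shift_lon_sketch <- function(data, lon, westB) {
  stopifnot(is.matrix(data), ncol(data) == length(lon))
  new_lon <- (lon - westB) %% 360 + westB  # map into [westB, westB + 360)
  ord <- order(new_lon)                    # ascending order from westB
  list(data = data[, ord, drop = FALSE], lon = new_lon[ord])
}

# Toy global grid (made up): 2 latitudes x 4 longitudes on 0..270
lon  <- seq(0, 270, by = 90)
data <- matrix(1:8, nrow = 2)
res  <- shift_lon_sketch(data, lon, westB = -180)
```

The data columns move together with their longitudes, so a field stored on 0..360 can be aligned with one stored on -180..180 before plotting or comparison.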

Subset bugfixes

  • Bugfix for the case where only one dimension is left after subsetting. It returned an error before.
source("https://earth.bsc.es/gitlab/es/ClimProjDiags/-/raw/master/R/Subset.R")
res <- Subset(array(1:20, dim = c(dat = 1, lat = 4, lon = 5)), along = "lat", indices = 1, drop = TRUE)
dim(res)
#lon
#  5
  • Bugfix for the case where the indices parameter is not a list. It returned wrong output before.
Subset(array(1:20, dim = c(dat = 1, lat = 4, lon = 5)),
       along = c('lat', 'lon'),
       indices = c(1, 1),
       drop = TRUE)

Status: In master

startR

New version 2.2.0-2 installed

NEWS

  • Use the destination grid to decide which indices to take after interpolation.
  • Bugfix when Start() parameter “return_vars” is not used.
  • Allow netCDF files to not have calendar attributes (force it to be standard calendar)

CSTools

New release on the way…

EXPECTED NEWS

  • Change dependency on package s2dverification to s2dv
  • Bugfix of as.s2dv_cube() for startR object
  • CST_QuantileMapping improvement
  • Improve flexibility of time dimensions in s2dv_cube()
  • CST_Calibration for forecast
  • CST_BiasCorrection new parameters for dimension flexibility

User presentation [Chung]

CSIndicators (agricultural indicators)

Q & A

Thanks for joining : )

Next meeting: 6th Oct 2022 (11 am)