1 The data

We are looking at a run that used 16 nodes running the HEPnOS daemon, with 512 targets, and 112 nodes running eventselection with 64 ranks per node (7168 ranks in total). The dataset is the 1691-subrun sample from the NOvA ND.
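As a quick consistency check on the job layout, the counts quoted above can be combined as follows. This is a minimal sketch: the variable names are illustrative and not part of the analysis code, and the per-node target count assumes the targets are spread evenly over the daemon nodes.

```r
# Job layout as stated in the text; variable names are illustrative only.
client_nodes   <- 112   # nodes running eventselection
ranks_per_node <- 64
daemon_nodes   <- 16    # nodes running the HEPnOS daemon
targets        <- 512

total_ranks      <- client_nodes * ranks_per_node   # 7168
targets_per_node <- targets / daemon_nodes          # 32, assuming an even spread
ranks_per_target <- total_ranks / targets           # 14
```

The total of 7168 ranks agrees with the rank range 0 to 7167 in the per-rank summary and with the `7168` in the data file name.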

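library(dplyr)    # %>%, group_by(), summarize()
library(ggplot2)  # plotting
raw <- readRDS("theta_es_7168_2020-06-30_01.rds")
global <- make_global_df(raw)  # per-rank timing (helper not defined in this document)
events <- make_events_df(raw)  # per-event timing (helper not defined in this document)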

2 Total job run time

The total job run time reported in the batch job log is ~574 seconds, of which about 100 seconds appear to be consumed by MPI startup and shutdown.
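The split between overhead and useful running time can be worked out directly from these two figures. This is only a rough sketch: both inputs are the approximate values quoted above, and the variable names are illustrative.

```r
# Figures quoted in the text; both are approximate.
job_seconds      <- 574   # total job run time from the batch job log
overhead_seconds <- 100   # rough estimate for MPI startup and shutdown

mpi_seconds       <- job_seconds - overhead_seconds   # about 474
overhead_fraction <- overhead_seconds / job_seconds   # about 0.17
```

On these numbers, roughly one sixth of the wall-clock time goes to startup and shutdown rather than to the MPI program itself.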

Looking only at the time during which the MPI program is running, we see the distribution of total run times by rank:

ggplot(global, aes(total)) +
  geom_histogram(bins=50) +
  labs(x="total running time", y="number of ranks")

3 Breakdown of running time

We can get a more detailed breakdown of the running time per rank by looking at the event-level data, summing the times in the different steps of data processing for each event handled by a given rank.

ebr <-
  events %>%
  group_by(rank) %>%
  summarize(nevents = n(),
            load = sum(load),
            rec = sum(rec),
            filt = sum(filt),
            nslices = sum(nslices),
            nbytes = sum(nbytes),
            .groups = "drop")
ebr

3.1 Summary

summary(ebr)
##       rank         nevents           load            rec         
##  Min.   :   0   Min.   :244.0   Min.   :109.2   Min.   :0.02844  
##  1st Qu.:1792   1st Qu.:500.0   1st Qu.:167.3   1st Qu.:0.06572  
##  Median :3584   Median :600.0   Median :185.0   Median :0.07577  
##  Mean   :3584   Mean   :555.2   Mean   :181.3   Mean   :0.07190  
##  3rd Qu.:5375   3rd Qu.:600.0   3rd Qu.:197.9   3rd Qu.:0.07815  
##  Max.   :7167   Max.   :600.0   Max.   :216.9   Max.   :0.08630  
##       filt          nslices         nbytes       
##  Min.   :1.248   Min.   : 765   Min.   :1169016  
##  1st Qu.:4.481   1st Qu.:2294   1st Qu.:3529803  
##  Median :5.125   Median :2608   Median :4006770  
##  Mean   :4.871   Mean   :2494   Mean   :3835275  
##  3rd Qu.:5.409   3rd Qu.:2766   3rd Qu.:4251654  
##  Max.   :6.076   Max.   :3131   Max.   :4946472

The loading time dominates the processing: the median per-rank load total is 185.0, compared with 5.125 for filt and 0.07577 for rec. The time (rec) it takes to transform the HEPnOS-related format to the Standard Record format is negligible.
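To put a rough number on how strongly loading dominates, the medians from the summary table can be turned into shares of the summed step time. This is a sketch under a stated assumption: the medians are taken column by column, so they do not belong to a single rank, and the result is only indicative.

```r
# Median per-rank totals copied from summary(ebr) above.
medians <- c(load = 185.0, rec = 0.07577, filt = 5.125)

# Fraction of the summed step time spent in each step.
shares <- medians / sum(medians)
round(100 * shares, 2)   # percentages per step
```

On these inputs load accounts for about 97% of the step time, filt for under 3%, and rec for well under 0.1%. The same calculation could be applied to the column sums of `ebr` for an exact figure over all ranks.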

3.2 Plots

ggplot(ebr, aes(nevents, filt)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point(alpha = 0.3)

ggplot(ebr, aes(nevents, rec)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point(alpha = 0.3)

ggplot(ebr, aes(nevents, load)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point(alpha = 0.3)
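The lines drawn by `geom_smooth(method="lm")` in the plots above are ordinary least-squares fits. To read off the fitted intercept and slope (the summed time per additional event) rather than judging them by eye, something like the following could be used. `fit_vs_nevents` is a hypothetical helper introduced here for illustration; it is not part of the original analysis and assumes a data frame shaped like `ebr`.

```r
# Hypothetical helper: fit `y ~ nevents` by least squares and return the
# intercept and slope. `df` is expected to have a numeric `nevents` column
# and a numeric column named by `y`, as `ebr` does.
fit_vs_nevents <- function(df, y) {
  fit <- lm(reformulate("nevents", response = y), data = df)
  coef(fit)   # named vector: "(Intercept)" and "nevents"
}

# Usage with the per-rank table built earlier (not run here):
# fit_vs_nevents(ebr, "load")
# fit_vs_nevents(ebr, "filt")
# fit_vs_nevents(ebr, "rec")
```

The `nevents` coefficient is the slope of the plotted line, so comparing it across the three steps gives the per-event cost of each step under the linear model.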