1 The data

library(dplyr)
library(ggplot2)

raw_112_64 <- read_raw_dataframes("campaign_4/112_64/timing*.dat")
raw_008_64 <- read_raw_dataframes("campaign_4/008_64/timing*.dat")
events_112_64 <- make_events_df(raw_112_64)
events_008_64 <- make_events_df(raw_008_64)

We are looking at runs that use a HEPnOS daemon running on 16 nodes, with 512 targets.

We use two runs of the eventselection program:

1. 112 nodes, each running 64 ranks (7168 ranks in total).
2. 8 nodes, each running 64 ranks (512 ranks in total).

The dataset used is the 1929 subrun sample from the NOvA near detector (ND).

There are 3979542 events in the data sample.
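A quick sanity check, assuming make_events_df yields one row per event: the two events data frames should contain the same number of rows.

# Both runs process the same data sample, so the event counts should agree.
stopifnot(nrow(events_112_64) == nrow(events_008_64))
nrow(events_112_64)  # expect 3979542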

2 Total job run time

According to the batch job logs, the total job run time was ~639 seconds for the 112 node run and ~614 seconds for the 8 node run.
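Since the wall-clock times are nearly equal, the difference in cost between the runs is driven almost entirely by node count. A rough comparison, counting only the eventselection nodes (not the 16 daemon nodes):

# Node-seconds consumed by each run, from the batch-log times above.
c(run_112 = 112 * 639,  # 71568 node-seconds
  run_008 = 8 * 614)    # 4912 node-seconds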

Looking only at the timing while the MPI program is running, we see the distribution of total run times by rank:
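A sketch of one way to draw this distribution from the per-event timings, assuming the load, rec, and filt columns together account for each rank's running time:

# Approximate per-rank total run time as the sum of the per-event
# load, rec, and filt timings (assumes other costs are negligible).
totals <- bind_rows("8" = events_008_64, "112" = events_112_64, .id = "run") %>%
  group_by(run, rank) %>%
  summarize(total = sum(load + rec + filt), .groups = "drop")
ggplot(totals, aes(total)) +
  geom_histogram(bins = 50) +
  facet_wrap(vars(run), scales = "free") +
  labs(x = "total run time for the rank (s)",
       title = "Distribution of total run times by rank")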

3 Breakdown of running time

# Roll per-event timings up into per-rank totals; the same summary is
# needed for both runs, so define it once.
summarize_by_rank <- function(events) {
  events %>%
    group_by(rank) %>%
    summarize(nevents = n(),
              load = sum(load),
              rec = sum(rec),
              filt = sum(filt),
              nslices = sum(nslices),
              nbytes = sum(nbytes),
              .groups = "drop")
}
ebr_112_64 <- summarize_by_rank(events_112_64)
ebr_008_64 <- summarize_by_rank(events_008_64)
ebr <- bind_rows("8" = ebr_008_64, "112" = ebr_112_64, .id = "run") %>%
  mutate(run = as.integer(run))
summary(ebr_112_64)
##       rank         nevents           load            rec         
##  Min.   :   0   Min.   :334.0   Min.   :100.5   Min.   :0.03735  
##  1st Qu.:1792   1st Qu.:500.0   1st Qu.:145.6   1st Qu.:0.06932  
##  Median :3584   Median :599.0   Median :172.7   Median :0.08003  
##  Mean   :3584   Mean   :555.2   Mean   :164.5   Mean   :0.07603  
##  3rd Qu.:5375   3rd Qu.:600.0   3rd Qu.:183.7   3rd Qu.:0.08279  
##  Max.   :7167   Max.   :600.0   Max.   :198.1   Max.   :0.09279  
##       filt           nslices         nbytes       
##  Min.   :0.8785   Min.   : 729   Min.   :1104288  
##  1st Qu.:4.1038   1st Qu.:2301   1st Qu.:3537753  
##  Median :4.6966   Median :2591   Median :3980346  
##  Mean   :4.4621   Mean   :2494   Mean   :3835275  
##  3rd Qu.:4.9446   3rd Qu.:2759   3rd Qu.:4245807  
##  Max.   :6.0373   Max.   :3194   Max.   :5154600
summary(ebr_008_64)
##       rank          nevents          load            rec        
##  Min.   :  0.0   Min.   :7182   Min.   :103.7   Min.   :0.9577  
##  1st Qu.:127.8   1st Qu.:7717   1st Qu.:107.6   1st Qu.:1.0537  
##  Median :255.5   Median :7784   Median :109.7   Median :1.0683  
##  Mean   :255.5   Mean   :7773   Mean   :110.9   Mean   :1.0678  
##  3rd Qu.:383.2   3rd Qu.:7832   3rd Qu.:113.8   3rd Qu.:1.0824  
##  Max.   :511.0   Max.   :7972   Max.   :131.6   Max.   :1.2927  
##       filt          nslices          nbytes        
##  Min.   :44.00   Min.   :22521   Min.   :34735164  
##  1st Qu.:65.13   1st Qu.:34510   1st Qu.:53028810  
##  Median :67.04   Median :35562   Median :54653988  
##  Mean   :66.49   Mean   :34919   Mean   :53693848  
##  3rd Qu.:68.73   3rd Qu.:36214   3rd Qu.:55694382  
##  Max.   :96.10   Max.   :38475   Max.   :59072916

3.1 Plots

The filtering step scales nearly perfectly. We can see this by looking at the number of events per second processed by each rank, plotted as a function of the number of events that rank processed, to see whether throughput varies with workload.

ggplot(ebr, aes(nevents, nevents/filt)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point(alpha = 0.3) +
  facet_wrap(vars(run), scales = "free_x") +
  labs(x = "Number of events processed by the rank",
       y = "events/sec for each rank",
       title = "Per-rank throughput of filtering step for 8 and 112 node runs.")

Data loading does not scale well. Recall that in both cases we are using a HEPnOS daemon with 512 targets. This means the 8 node run has only 1 client rank per target, while the 112 node run has 14 client ranks per target.
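The per-target client counts follow directly from the rank counts:

# Client ranks per HEPnOS target: eventselection ranks / 512 targets.
c(run_112 = (112 * 64) / 512,  # 14 client ranks per target
  run_008 = (8 * 64) / 512)    # 1 client rank per target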

ggplot(ebr, aes(nevents, nevents/load)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point(alpha = 0.3) +
  facet_wrap(vars(run), scales = "free_x") +
  labs(x = "Number of events loaded by the rank",
       y = "events/sec for each rank",
       title = "Per-rank throughput of data loading step for 8 and 112 node runs.")

We can also look at the aggregate loading rate for the whole program. It is not clear what the best way to define this is; what I have done here is to sum the rates for all the ranks, giving a throughput in events/second. (Summing per-rank rates implicitly assumes that all ranks are loading concurrently.)

ebr %>% group_by(run) %>%
  summarize(throughput=sum(nevents/load), .groups="drop") %>%
  knitr::kable()
| run | throughput |
|----:|-----------:|
|   8 |   35928.85 |
| 112 |   24483.10 |
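An alternative, for comparison, would be to treat loading as fully concurrent across ranks and divide the total number of events by the slowest rank's loading time; a sketch:

# Alternative aggregate rate: total events / slowest per-rank load time.
ebr %>%
  group_by(run) %>%
  summarize(throughput = sum(nevents) / max(load), .groups = "drop") %>%
  knitr::kable()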