This is an R Markdown Notebook for exploring the performance of the lua-protobuf library.
Load the libraries we will use:
library(readr)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(ggplot2)
Load the data produced by protobuf_bench.lua:
data <- read_delim("https://gist.githubusercontent.com/lukego/504dc8d35fd55e96f3f55be6b493d284/raw/82e716ec71ce1d97cbf79b64183d31b9a157bad3/lua-protobuf-bench.csv", "!")
Parsed with column specification:
cols(
filename = col_character(),
bytes = col_integer(),
load.sec = col_double(),
decode.sec = col_double(),
encode.sec = col_double()
)
data$loops <- 100000 # This field was missing in the version that produced this data
data <- mutate(data, msg.usec = (load.sec + decode.sec + encode.sec) * 1000000 / loops)
data <- filter(data, bytes<10240) # Consider only "small-medium" messages < 10K
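As a worked example of the msg.usec calculation, here is a minimal sketch using made-up timings. The data frame `example` and its values are hypothetical and are not taken from the benchmark; the timings are assumed to be total seconds over all `loops` iterations.

```r
# Hypothetical timings for a single file: total seconds over the whole run.
example <- data.frame(load.sec = 0.5, decode.sec = 2.0, encode.sec = 1.5,
                      loops = 100000)

# Sum the three phases, convert seconds to microseconds,
# then divide by the iteration count to get the cost of one message.
example$msg.usec <- (example$load.sec + example$decode.sec + example$encode.sec) *
  1000000 / example$loops

print(example$msg.usec)
```

With these assumed numbers, 4 seconds spread over 100000 iterations comes to 40 microseconds per message.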
Check the overall percentage of time spent loading data versus decoding and encoding:
loadtime <- sum(data$load.sec)
decodetime <- sum(data$decode.sec)
encodetime <- sum(data$encode.sec)
totaltime <- loadtime + decodetime + encodetime
list(load.percent = loadtime*100/totaltime,
decode.percent = decodetime*100/totaltime,
encode.percent = encodetime*100/totaltime)
$load.percent
[1] 13.46786
$decode.percent
[1] 51.55125
$encode.percent
[1] 34.9809
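The same breakdown can be written more compactly by summing the three timing columns at once. This is only a sketch in base R on a hypothetical stand-in data frame (`bench`, with made-up values); on the real data the same two lines would be applied to `data`.

```r
# Hypothetical stand-in for the benchmark data frame (values are made up).
bench <- data.frame(load.sec   = c(1, 2),
                    decode.sec = c(4, 6),
                    encode.sec = c(3, 4))

# Total seconds per phase, then each phase's share of the overall total.
totals  <- colSums(bench[, c("load.sec", "decode.sec", "encode.sec")])
percent <- 100 * totals / sum(totals)

print(percent)
```

Because the shares are computed from a single vector, they sum to 100 by construction, which is a handy sanity check.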
Plot microseconds elapsed versus message size with a linear regression line:
ggplot(data, aes(y = msg.usec, x = bytes)) +
  geom_point(color = "blue", alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "red", alpha = 0.5) +
  labs(y = "microseconds per message (read+decode+encode)",
       x = "protobuf message size in bytes")
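Check the slope and fit of the linear model quantitatively too. Note that the formula below regresses bytes on msg.usec, the reverse of the plot's axes, so the slope is in bytes per microsecond: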
model <- lm(bytes ~ msg.usec, data=data)
summary(model)
Call:
lm(formula = bytes ~ msg.usec, data = data)
Residuals:
     Min       1Q   Median       3Q      Max
-1028.09   -74.98   -13.11    32.51  2195.30

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  50.6828    13.6495   3.713 0.000231 ***
msg.usec     22.6836     0.2696  84.133  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 242.4 on 436 degrees of freedom
Multiple R-squared: 0.942, Adjusted R-squared: 0.9418
F-statistic: 7078 on 1 and 436 DF, p-value: < 2.2e-16
The linear model fits well both visually and quantitatively: R-squared is 0.942, with a slope of about 22.7 bytes per microsecond.
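The model above predicts bytes from msg.usec. To read the cost directly in microseconds per byte, one option is to fit the model in the same orientation as the plot. Here is a minimal sketch on simulated data; the data frame `sim`, the 5 microsecond fixed cost, and the 0.04 microseconds per byte are assumptions chosen for illustration and are not benchmark results. On the real data the call would be `lm(msg.usec ~ bytes, data = data)`.

```r
set.seed(1)

# Simulated message sizes from 100 bytes to 10000 bytes.
sim <- data.frame(bytes = seq(100, 10000, by = 100))

# Assumed ground truth for the simulation only:
# a fixed cost of 5 usec plus 0.04 usec per byte, with some noise.
sim$msg.usec <- 5 + 0.04 * sim$bytes + rnorm(nrow(sim), sd = 2)

# Same orientation as the plot: time as a function of size.
fit <- lm(msg.usec ~ bytes, data = sim)

print(coef(fit))               # intercept in usec, slope in usec per byte
print(summary(fit)$r.squared)  # goodness of fit
```

In this orientation the intercept estimates the fixed per-message overhead and the slope estimates the marginal cost of each additional byte.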