1. Description

The results presented in this report correspond to the paper “CMI: An Online Multi-objective Genetic Autoscaler for Scientific and Engineering Workflows in Cloud Infrastructures with Unreliable Virtual Machines”.

Abstract: Cloud Computing is becoming the leading paradigm for executing scientific and engineering workflows. The large-scale nature of the experiments they model and their variable workloads make clouds the ideal execution environment due to prompt and elastic access to huge amounts of computing resources. Autoscalers are middleware-level software components that allow scaling up and down the computing platform by acquiring or terminating virtual machines (VM) at the time that workflow’s tasks are being scheduled. In this work we propose a novel online multi-objective autoscaler for workflows denominated Cloud Multi-objective Intelligence (CMI), that aims at the minimization of makespan, monetary cost and the potential impact of errors derived from unreliable VMs. In addition, this problem is subject to monetary budget constraints. CMI is responsible for periodically solving the autoscaling problems encountered along the execution of a workflow. Simulation experiments on four well-known workflows exhibit that CMI significantly outperforms a state-of-the-art autoscaler of similar characteristics called Spot Instances Aware Autoscaling (SIAA). These results convey a solid base for deepening in the study of other meta-heuristic methods for autoscaling workflow applications using cheap but unreliable infrastructures.

2. Results

Scatter plot of all results.

This scatter plot shows the normalized makespan and cost obtained by CMI and by every SIAA configuration on the four studied workflows.

Summary

Mean values per workflow and strategy.

workflow         strategy  mean(makespan_norm)  mean(totalCost_norm)  mean(l2Distance)
CyberShake-1000  CMI       0.1679785            0.1135412             0.2180839
CyberShake-1000  SIAA      0.0929246            0.6890614             0.7130178
LIGO-1000        CMI       0.1033632            0.0681047             0.1415752
LIGO-1000        SIAA      0.1788259            0.6345928             0.7198810
Montage-1000     CMI       0.0762116            0.0451403             0.0992657
Montage-1000     SIAA      0.2749714            0.7185512             0.8024100
psmerge-841      CMI       0.3627461            0.1198194             0.3870419
psmerge-841      SIAA      0.2489149            0.5387598             0.6251324

Histograms of normalized makespan

Histograms of normalized cost

Histograms of L2 distance

Distance scatter plots per workflow

Scatter plots of makespan and cost. L2 distance is represented by the size of the shapes.
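The report does not spell out how the L2 distance is computed. A minimal sketch, assuming it is the Euclidean distance from the ideal point (0, 0) in the normalized makespan/cost plane (the function name `l2_distance` is hypothetical), could look like this:

```python
import math


def l2_distance(makespan_norm, cost_norm):
    """Euclidean distance to the origin of the normalized objective space.

    Assumption: (0, 0) is the ideal point, i.e. lowest makespan and lowest cost.
    """
    return math.hypot(makespan_norm, cost_norm)


# A run with normalized makespan 0.3 and normalized cost 0.4.
print(l2_distance(0.3, 0.4))

# Distance of the mean CyberShake-1000/CMI point from the summary table.
print(round(l2_distance(0.1679785, 0.1135412), 3))
```

Under this assumption, smaller values mean a better trade-off between the two objectives. The distance of the mean point (about 0.203 for CyberShake-1000 with CMI) is not expected to equal the reported mean L2 distance (0.2180839), because the mean of the per-run distances is at least as large as the distance of the mean point.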

Metrics Summary

For each combination of workflow and strategy, the tables present the mean, median, standard deviation, minimum and maximum, and whether the data are normally distributed according to the Shapiro-Wilk test at a significance level of 0.001.

Makespan summary
workflow         strategy  mean        median      stdev       min         max         normal
CyberShake-1000  CMI       19940.27    18999.33    2830.442    16676.33    30702.00    FALSE
CyberShake-1000  SIAA      18351.60    17810.33    2275.895    16384.67    37551.67    FALSE
LIGO-1000        CMI       96731.35    90570.00    17183.338   86295.00    150941.00   FALSE
LIGO-1000        SIAA      104379.84   90049.67    25094.512   86255.00    187609.67   FALSE
Montage-1000     CMI       28257.60    27303.33    1868.436    26371.67    32642.00    FALSE
Montage-1000     SIAA      33176.11    31813.67    2998.022    28391.00    51117.67    FALSE
psmerge-841      CMI       1841397.19  1850905.50  473430.637  1088706.00  3189958.00  TRUE
psmerge-841      SIAA      1600506.95  1549057.67  243573.927  1073751.67  2333612.00  FALSE

Cost summary
workflow         strategy  mean         median      stdev         min        max         normal
CyberShake-1000  CMI       9.446461     8.99855     2.1182871     6.4144     15.3563     TRUE
CyberShake-1000  SIAA      28.333710    32.91900    9.4483318     5.7203     38.5380     FALSE
LIGO-1000        CMI       67.010697    66.43943    10.3340607    50.2386    88.7052     TRUE
LIGO-1000        SIAA      206.519398   230.54612   71.8861621    50.7079    296.5080    FALSE
Montage-1000     CMI       2.609807     2.52165     0.6020713     1.8090     4.2644      TRUE
Montage-1000     SIAA      18.484323    22.70700    6.6143812     1.5457     25.1190     FALSE
psmerge-841      CMI       3115.378236  3088.03540  767.9219733   1981.9404  5002.8504   TRUE
psmerge-841      SIAA      7078.366103  7596.52592  1990.2360597  2083.6192  11441.4906  FALSE

L2 summary
workflow         strategy  mean       median     stdev      min        max        normal
CyberShake-1000  CMI       0.2180839  0.1664743  0.1239861  0.1037550  0.6886193  FALSE
CyberShake-1000  SIAA      0.7130178  0.8344142  0.2635910  0.0818010  1.0061122  FALSE
LIGO-1000        CMI       0.1415752  0.0881913  0.1600229  0.0056403  0.6390915  FALSE
LIGO-1000        SIAA      0.7198810  0.8524873  0.2508280  0.0166427  1.1450096  FALSE
Montage-1000     CMI       0.0992657  0.0663306  0.0653927  0.0227788  0.2536887  FALSE
Montage-1000     SIAA      0.8024100  0.9166872  0.2035659  0.1759440  1.0674611  FALSE
psmerge-841      CMI       0.3870419  0.3908536  0.2294474  0.0073866  1.0111453  TRUE
psmerge-841      SIAA      0.6251324  0.6625321  0.1375450  0.2738938  1.0342617  FALSE
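
The per-group statistics in the tables above can be reproduced along the following lines. This is only a sketch using the Python standard library: the record layout (dicts with `workflow`, `strategy` and a metric key), the function name `summarize` and the sample values are assumptions for illustration, not the code or data behind this report.

```python
import statistics
from collections import defaultdict


def summarize(rows, metric):
    """Group rows by (workflow, strategy) and compute descriptive statistics.

    rows: iterable of dicts with 'workflow', 'strategy' and `metric` keys
    (hypothetical layout, assumed for this example).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["workflow"], row["strategy"])].append(row[metric])

    summary = {}
    for key, values in groups.items():
        summary[key] = {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # Sample standard deviation (n - 1 denominator).
            "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }
    return summary


# Made-up values, only to show the call; they are not results from the report.
example_rows = [
    {"workflow": "Montage-1000", "strategy": "CMI", "makespan": 27000.0},
    {"workflow": "Montage-1000", "strategy": "CMI", "makespan": 28000.0},
    {"workflow": "Montage-1000", "strategy": "CMI", "makespan": 29000.0},
    {"workflow": "Montage-1000", "strategy": "SIAA", "makespan": 31000.0},
    {"workflow": "Montage-1000", "strategy": "SIAA", "makespan": 35000.0},
]
for group, stats in summarize(example_rows, "makespan").items():
    print(group, stats)
```

The `normal` column is not covered by this sketch. With SciPy available, `scipy.stats.shapiro` returns a p-value that could be compared against 0.001, with the column being TRUE when normality is not rejected.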

3. Significance Tests

The Wilcoxon test was applied to check the significance of the results. It is a non-parametric test whose null hypothesis is that the two compared samples come from the same distribution.

If the p-value is below the significance level, the null hypothesis is rejected (i.e. the samples come from different distributions).

For this work, the significance level is 0.001.
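
To make the decision rule concrete, here is a hedged sketch. The report does not state which Wilcoxon variant was used; this example assumes the rank-sum (Mann-Whitney) form for two independent samples and uses the normal approximation with tie correction and no continuity correction. The function names are hypothetical, and in practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks (ties get the average rank) and the tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def rank_sum_p_value(x, y):
    """Two-sided p-value of the rank-sum test via the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u <= 0:  # every observation identical: no evidence of a difference
        return 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejects_null(x, y, alpha=0.001):
    """Apply the rule from the text: reject when the p-value is below alpha."""
    return rank_sum_p_value(x, y) < alpha


# Made-up samples: clearly separated versus interleaved.
separated = rejects_null(list(range(30)), list(range(100, 130)))
interleaved = rejects_null(list(range(1, 60, 2)), list(range(2, 61, 2)))
print(separated, interleaved)
```

With the made-up samples above, the clearly separated pair leads to rejection at the 0.001 threshold, while the interleaved pair does not.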

Interpretation of test results

Each test result is one of:
- better means that CMI significantly outperformed SIAA
- same means that CMI and SIAA are not significantly different
- worst means that SIAA significantly outperformed CMI
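
The three labels can be derived from a p-value and the two compared values as sketched below. This assumes that lower values are better for every metric (makespan, cost and L2 distance are all minimized) and that one summary value per strategy is compared; the function name `classify` and its parameters are hypothetical.

```python
def classify(cmi_value, siaa_value, p_value, alpha=0.001):
    """Label a CMI vs. SIAA comparison as 'better', 'same' or 'worst'.

    Assumptions: lower metric values are better, and `p_value` comes from a
    two-sample significance test such as the one described in this section.
    """
    if p_value >= alpha:
        return "same"  # difference is not statistically significant
    # Significant difference: the strategy with the lower value wins.
    return "better" if cmi_value < siaa_value else "worst"


# Made-up inputs, only to show the three possible outcomes.
print(classify(0.10, 0.70, 1e-6))  # CMI lower and significant
print(classify(0.39, 0.38, 0.40))  # not significant
print(classify(0.70, 0.10, 1e-6))  # SIAA lower and significant
```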

General summary of tests

Counts of test outcomes across all workflows (4) and SIAA variants (11), for a total of 44 tests.

analysis amount
better 41
same 3

Note that none of the 44 tests resulted in worst, meaning that CMI was never significantly outperformed by SIAA.

Detail of the cases for which CMI and SIAA are not significantly different.

Cases in which CMI and SIAA are not significantly different.
index  workflow         metric      spotsRatio    SIAA       CMI        signif.diff  analysis  improvementPrc
120    psmerge-841      l2Distance  Constant:0.9  0.3866944  0.3908536  FALSE        same      -0.0041592
129    CyberShake-1000  l2Distance  Constant:1.0  0.1413727  0.1664743  FALSE        same      -0.0251016
132    psmerge-841      l2Distance  Constant:1.0  0.4234930  0.3908536  FALSE        same      0.0326394
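
The report does not define the `improvementPrc` column. In the three rows above, its value matches the plain difference SIAA − CMI, and the CMI values match the CMI medians in the L2 summary table. The short check below verifies the first observation on the listed rows; the helper name `improvement` is hypothetical, and whether the column is computed the same way for other rows or metrics is not stated.

```python
# (workflow, SIAA, CMI, reported improvementPrc), copied from the table above.
rows = [
    ("psmerge-841", 0.3866944, 0.3908536, -0.0041592),
    ("CyberShake-1000", 0.1413727, 0.1664743, -0.0251016),
    ("psmerge-841", 0.4234930, 0.3908536, 0.0326394),
]


def improvement(siaa, cmi):
    """Difference SIAA - CMI; positive means CMI has the lower L2 distance."""
    return siaa - cmi


for workflow, siaa, cmi, reported in rows:
    # Values are printed to 7 decimals, so allow a small rounding tolerance.
    assert abs(improvement(siaa, cmi) - reported) < 1e-7
    print(workflow, round(improvement(siaa, cmi), 7))
```

A positive value therefore favors CMI and a negative one favors SIAA, although none of these three differences is statistically significant.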

Scatter plots of simulations for the cases in which CMI and SIAA are not significantly different.