Genz function 1 in 8d

Purpose of this document

This document compares the performance of the serial and parallel implementations of the CUHRE algorithm on an integrand that is not positive definite.

The integrand chosen is: \[ \cos(s + 2t + 3u + 4v + 5w + 6x + 7y + 8z) / k \] with \[ k = (1/315) \sin(1) \sin(3/2) \sin(2) \sin(5/2) \sin(3) \sin(7/2) \sin(4) (\sin(37/2) - \sin(35/2)). \]

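\(k\) is approximately equal to 3.43955795218325e-05. It is the integral of the cosine over the unit hypercube \([0,1]^8\), so the true value of the normalized integral is 1.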
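The value of \(k\) can be checked numerically. The following is a minimal sketch in Python, not part of the original analysis. It evaluates the closed form given above and compares it with an independent product form, under the assumption that the integration region is the unit hypercube \([0,1]^8\). The names `k_text` and `k_product` are illustrative.

```python
import math

# Closed form for k exactly as written in the text.
k_text = (
    (1 / 315)
    * math.sin(1) * math.sin(3 / 2) * math.sin(2) * math.sin(5 / 2)
    * math.sin(3) * math.sin(7 / 2) * math.sin(4)
    * (math.sin(37 / 2) - math.sin(35 / 2))
)

# Independent check (assumption: the region is the unit hypercube [0,1]^8).
# For coefficients j = 1..8 the integral of cos(sum_j j*x_j) factorizes as
#   cos(sum_j j / 2) * prod_j (2 * sin(j / 2) / j),  with sum_j j / 2 = 18.
k_product = math.cos(18.0) * math.prod(2 * math.sin(j / 2) / j for j in range(1, 9))

print(f"k (closed form from the text) = {k_text:.15e}")
print(f"k (product form)              = {k_product:.15e}")
```

The two expressions agree because \(\sin(37/2) - \sin(35/2) = 2 \cos(18) \sin(1/2)\).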

Testing environment

These tests were run on ibmpower9.fnal.gov.

The Power9 machine used in these tests is a PowerNV 8335-GTG (AC 922) system. It has two processors, each with 8 cores. Each core has 4 slices.

Description of the dataframe

alg: the name of the algorithm (cuhre is serial; gpucuhre is parallel)
epsrel: the fractional error target
value: the estimated value of the integral
errorest: the estimated error for the result
error: the absolute difference between the estimated value and the true value
neval: the number of function evaluations used
nregions: the number of regions used
time: the time in milliseconds for the calculation
r: the ratio errorest/(epsrel*value); this is less than 1 if the algorithm has converged

A value of NA indicates that the algorithm did not converge, but rather stopped because the maximum number of function evaluations had been reached.
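The definition of the r column can be illustrated with a short Python sketch. The function name `convergence_ratio` is hypothetical, and taking the absolute value of the estimate is an assumption added here so that the ratio stays meaningful for a negative estimate; the description above writes the denominator simply as epsrel*value. The inputs are the printed (rounded) numbers from the first cuhre row of the dataframe.

```python
def convergence_ratio(errorest, epsrel, value):
    """Return errorest / (epsrel * |value|).

    The abs() is an assumption for illustration; for the positive
    estimates in this document it changes nothing.
    """
    return errorest / (epsrel * abs(value))


# First cuhre row of the dataframe (epsrel = 1e-3), as printed.
r = convergence_ratio(errorest=0.0009998640, epsrel=1.0e-3, value=0.9999127)
converged = r < 1

print(f"r = {r:.7f}, converged = {converged}")
```

A ratio just below 1, as here, means the algorithm stopped as soon as the estimated error met the requested tolerance.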

alg	epsrel	value	errorest	error	neval	nregions	time	r
cuhre	1.0000e-03	0.9999127	0.0009998640	8.73389e-05	48526075	21958	26820.71	0.9999513
cuhre	5.0000e-04	1.0000750	0.0005000200	7.50680e-05	79205295	35840	44249.74	0.9999650
cuhre	2.5000e-04	1.0000490	0.0002500074	4.88586e-05	124311395	56250	70582.94	0.9999806
cuhre	1.2500e-04	1.0000140	0.0001250007	1.35957e-05	202753135	91744	121709.60	0.9999916
cuhre	6.2500e-05	1.0000040	0.0000625000	4.13070e-06	335613915	151862	230264.20	0.9999955
cuhre	3.1250e-05	1.0000010	0.0000312499	1.45540e-06	548874495	248360	451179.80	0.9999961
cuhre	1.5625e-05	0.9999997	0.0000156250	2.66500e-07	882520405	399331	935296.90	0.9999971
cuhre	7.8125e-06	1.0000010	0.0000078125	8.65700e-07	1388581675	628318	2060807.00	0.9999987
cuhre	3.9062e-06	1.0000000	0.0000039062	1.85500e-07	2117439675	958118	4446506.00	0.9999990
cuhre	1.9531e-06	1.0000000	0.0000019531	1.61000e-07	3294176275	1490578	10027550.00	0.9999995

Analysis

We do not yet have results from the parallel algorithm.

Because the fractional error tolerances span a large range, and the times span an even larger one, we use log scales on both axes of our plot. The line is a fitted linear model; the fit is clearly poor.
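One way to see why a single straight line describes the timing data poorly is to look at the local slope between consecutive rows on log-log axes. The sketch below is illustrative only and is not the fit used for the plot. It uses the epsrel and time columns of the cuhre rows above; the variable names are assumptions.

```python
import math

# epsrel and time (ms) columns for the cuhre rows of the dataframe.
epsrel = [1.0000e-03, 5.0000e-04, 2.5000e-04, 1.2500e-04, 6.2500e-05,
          3.1250e-05, 1.5625e-05, 7.8125e-06, 3.9062e-06, 1.9531e-06]
time_ms = [26820.71, 44249.74, 70582.94, 121709.60, 230264.20,
           451179.80, 935296.90, 2060807.00, 4446506.00, 10027550.00]

# Work in log10(1/epsrel) so that tighter tolerances lie to the right.
x = [math.log10(1.0 / e) for e in epsrel]
y = [math.log10(t) for t in time_ms]

# Slope of log(time) against log(1/epsrel) between consecutive rows.
# A pure power law would give the same slope for every pair.
local_slopes = [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]

for e, s in zip(epsrel[1:], local_slopes):
    print(f"epsrel = {e:.4e}  local slope = {s:.3f}")
```

The local slope is smaller at the loose tolerances than at the tight ones, so the points curve upward and no single power law in epsrel describes the times across the whole range.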

The number of regions needed to reach a given fractional error tolerance appears to follow a power law in the reciprocal of that tolerance.
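The apparent power law can be estimated with an ordinary least-squares fit of log(nregions) against log(1/epsrel). The following Python sketch is a hedged illustration using the cuhre rows of the dataframe; it is not the analysis code behind this document, and the model form nregions ≈ c (1/epsrel)^p, along with the names `slope` and `intercept`, is an assumption made for the example.

```python
import math

# epsrel and nregions columns for the cuhre rows of the dataframe.
epsrel = [1.0000e-03, 5.0000e-04, 2.5000e-04, 1.2500e-04, 6.2500e-05,
          3.1250e-05, 1.5625e-05, 7.8125e-06, 3.9062e-06, 1.9531e-06]
nregions = [21958, 35840, 56250, 91744, 151862,
            248360, 399331, 628318, 958118, 1490578]

# Assumed model: nregions = c * (1/epsrel)**p, which is linear in logs:
#   log(nregions) = log(c) + p * log(1/epsrel)
x = [math.log(1.0 / e) for e in epsrel]
y = [math.log(n) for n in nregions]

# Ordinary least-squares slope and intercept, computed by hand.
n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

print(f"nregions ~ {math.exp(intercept):.3g} * (1/epsrel)^{slope:.3f}")
```

The fitted exponent is below 1, meaning that in this data halving the tolerance requires fewer than twice as many regions.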

Marc Paterno

6/8/2020