This is a condensed version of the material presented at the 2021 performance workshop, tailored to the peano4 use case. Score-P provides profiling and event tracing through instrumentation of user code. We use Scalasca to profile the Score-P-instrumented executable.
Score-P provides compiler wrappers that perform the code instrumentation. The runtime behaviour of Score-P is generally controlled through environment variables prefixed with SCOREP_. This includes the output directory (SCOREP_EXPERIMENT_DIRECTORY). NOTE: Scalasca takes care of generating the output directory automatically.
There is a handy command that prints all available variables:
scorep-info config-vars --full
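As a minimal sketch of how these variables are used, the snippet below sets a few of them for a run that is started directly rather than through Scalasca. The values and the directory name scorep_manual_run are illustrative assumptions, not settings taken from the workshop material.

```shell
# Example measurement settings; the values are illustrative only.
export SCOREP_ENABLE_PROFILING=true    # write a call-path profile
export SCOREP_ENABLE_TRACING=false     # do not record an event trace
# Hypothetical directory name; Scalasca picks its own when scan is used.
export SCOREP_EXPERIMENT_DIRECTORY="$PWD/scorep_manual_run"

# Show the documentation of a single variable, if Score-P is on the PATH.
if command -v scorep-info >/dev/null 2>&1; then
  scorep-info config-vars --full | grep -A 5 'SCOREP_TOTAL_MEMORY' || true
fi
```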
In the simplest case, replacing gcc with scorep --user gcc in all compilation steps is sufficient. Autotools (and CMake) projects need some more care.
It is very important to turn off the compiler wrapper during the configure step (SCOREP_WRAPPER=off) and to pass --disable-dependency-tracking.
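For a CMake project the same pattern might look roughly like the sketch below. The function name scorep_cmake_build is a hypothetical helper introduced here for illustration; the wrapper names scorep-gcc and scorep-g++ and the make-time variable SCOREP_WRAPPER_INSTRUMENTER_FLAGS are assumed to be provided by the installed Score-P version.

```shell
# Hypothetical helper for a CMake project; call it from an empty build directory.
scorep_cmake_build() {
  # Wrapper off: CMake's compiler checks should see a plain, uninstrumented compiler.
  SCOREP_WRAPPER=off cmake "$@" \
    -DCMAKE_C_COMPILER=scorep-gcc \
    -DCMAKE_CXX_COMPILER=scorep-g++ || return 1
  # Wrapper on again (the default): instrument, including user instrumentation.
  make SCOREP_WRAPPER_INSTRUMENTER_FLAGS="--user"
}
```

It would be called as scorep_cmake_build .. from within the build directory. Keeping the wrapper off only for the configuration step means that the build system's test programs are not instrumented, while the actual build is.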
module purge
# Intel: no <filesystem> support?
#module load intel_comp/2020-update2 intel_mpi/2020-update2
#module load scorep/7.0 - linker errors also with gnu10 get_location_from_adhoc_loc
module load gnu_comp/10.2.0 openmpi/4.0.5 scorep/6.0
SCOREP_WRAPPER=off CXX=scorep-g++ ./configure --with-mpi=scorep-mpicxx --with-multithreading=omp CXXFLAGS="-std=c++17 -fopenmp -march=native -O3" LDFLAGS="-fopenmp" --enable-loadbalancing --enable-exahype --enable-particles --enable-blockstructured --disable-dependency-tracking
make -j20
cd examples/exahype2/euler
export PYTHONPATH=$PWD/../../../python:$PYTHONPATH
python3 example-scripts/finitevolumes.py -cs 0.1 -f -et 0.0005
export SCOREP_TOTAL_MEMORY=16GB
scan mpiexec -np 2 ./peano4
square -s ./scorep_peano4_2xO_sum
scan -q -t mpiexec -np 2 ./peano4
NOTE: the scorep/7.0 module does not seem to work. Installing Score-P locally is very easy, though, as its build tracks and installs its own dependencies for you:
wget http://perftools.pages.jsc.fz-juelich.de/cicd/scorep/tags/scorep-7.0/scorep-7.0.tar.gz
tar xzf scorep-7.0.tar.gz
cd scorep-7.0
./configure --prefix=$PWD/local
make install -j20
export PATH=$PWD/local/bin:$PATH
export LD_LIBRARY_PATH=$PWD/local/lib:$LD_LIBRARY_PATH
NOTE: It is a good idea to set SCOREP_TOTAL_MEMORY generously in order to avoid frustrating iterations: Score-P only complains (and fails to provide output) at the end of the program if there was not enough memory.
After running square, the summary tells you how much memory you need for a full trace run. If the reported minimum exceeds the available system memory, a filter file must be provided.
tbd
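Until a filter tailored to peano4 is available, the following is a minimal sketch of what a Score-P filter file can look like and how it could be activated. The file name peano4.filt and the excluded patterns are placeholders chosen for illustration; the regions that actually need filtering have to be read off the summary report.

```shell
# Write a hypothetical filter file; the EXCLUDE patterns are placeholders.
cat > peano4.filt <<'EOF'
SCOREP_REGION_NAMES_BEGIN
  EXCLUDE
    std::*
    *operator*
SCOREP_REGION_NAMES_END
EOF

# Preview the effect of the filter on the existing summary, if available.
if command -v scorep-score >/dev/null 2>&1 && [ -f scorep_peano4_2xO_sum/profile.cubex ]; then
  scorep-score -f peano4.filt scorep_peano4_2xO_sum/profile.cubex
fi

# Apply the filter at measurement time in subsequent runs.
export SCOREP_FILTERING_FILE="$PWD/peano4.filt"
```

With the variable exported, the trace run from above (scan -q -t mpiexec -np 2 ./peano4) should pick up the filter; scan is also understood to accept the file directly through its -f option.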