- Race condition
- Synchronisation
29/10/2021
#pragma omp parallel default(none) private(mya) shared(a)
{
  #pragma omp for
  for (int i = 0; i < 1000; i++) { ... ; }
}
g++ -fopenmp prog.cxx -o myprogram
OMP_NUM_THREADS=4 ./myprogram
#include <cstdio>

int main() {
  int a = 10;
  #pragma omp parallel default(none) shared(a)
  {
    a = a + 1;   // unprotected update of the shared variable: race condition
  }
  printf("Final result a=%d\n", a);
  return 0;
}
Compile and run with:
g++ -fopenmp rcdemo.cxx -o rcdemo
./rcdemo
Initialise a=10 in shared memory
Thread 1 loads 10 into CPU register
Thread 1 adds 1 to 10 in CPU register
Thread 2 loads 10 into CPU register
Thread 2 adds 1 to 10 in CPU register
Thread 1 stores 11 in shared memory a
Thread 2 stores 11 in shared memory a
Symptom: non-deterministic behaviour of the code; in the interleaving above both threads store 11, so one increment is lost and the result is not 10 plus the number of threads.
Solution: if multiple threads need to update a shared variable, the update must be protected (there are several ways in the OpenMP standard to do that).
Race conditions can be a source of bugs that are hard to find and fix, especially in larger programs.
Avoid them by sticking with default(none) and thinking carefully about which variables are shared and which are private.
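For instance, with default(none) every variable used inside the parallel region must be listed explicitly; forgetting one is a compile-time error rather than a silent race. A minimal sketch (the exact diagnostic depends on the compiler):

#include <cstdio>

int main() {
  int a = 10;
  int mya = 0;

  // Every variable referenced in the region is declared shared or private;
  // omitting, e.g., shared(a) would make the compiler reject the region.
  #pragma omp parallel default(none) private(mya) shared(a)
  {
    mya = a;   // read of shared a into the thread-private mya: no race
  }

  printf("a=%d\n", a);
  return 0;
}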
Task: compute \(\sum\limits_{n=1}^{100} n\)
#include <cstdio>

int main() {
  int sum = 0;
  const int N = 100;
  #pragma omp parallel for
  for (int n = 1; n <= N; n++) {
    sum += n;   // Race condition here
  }
  printf("Result is %d. It should be %d\n", sum, N*(N+1)/2);
  return 0;
}
We can safely update a single shared variable using
#pragma omp atomic
statement;
atomic protects a single address in shared memory.
atomic applies to a single code statement only.
atomic applies only to basic types.
statement must be of simple form like:
x+=1; x++; x--; x-=2; x*=2;
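For instance, the race in the earlier rcdemo program can be removed by protecting its single update with atomic. A minimal sketch:

#include <cstdio>

int main() {
  int a = 10;
  #pragma omp parallel default(none) shared(a)
  {
    // The increment of the shared a is now performed atomically,
    // so every thread's contribution is applied exactly once.
    #pragma omp atomic
    a += 1;
  }
  printf("Final result a=%d\n", a);   // 10 + number of threads
  return 0;
}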
Task: compute \(\sum\limits_{n=1}^{100} n\)
#include <cstdio>

int main() {
  int sum = 0;
  const int N = 100;
  #pragma omp parallel for
  for (int n = 1; n <= N; n++) {
    #pragma omp atomic
    sum += n;   // Race condition eliminated
  }
  printf("Result is %d. It should be %d\n", sum, N*(N+1)/2);
  return 0;
}
For things not covered by atomic, use #pragma omp critical.
critical ensures that only one thread at a time executes a block of code.
critical is typically more expensive than atomic.
#pragma omp critical
{
  statement1;
  statement2;
  ...
}
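As a concrete illustration (a minimal sketch, not taken from the examples above): atomic cannot protect an update that changes more than one variable, but critical can, e.g. tracking both the maximum value and the index where it occurs:

#include <cstdio>
#include <vector>

int main() {
  std::vector<double> v = {3.0, 7.5, 1.2, 9.9, 4.4};
  double maxval = v[0];
  int maxidx = 0;

  #pragma omp parallel for
  for (int i = 1; i < (int)v.size(); i++) {
    // Two shared variables change together, so a single atomic is not enough;
    // the critical section lets only one thread at a time execute this block.
    #pragma omp critical
    {
      if (v[i] > maxval) {
        maxval = v[i];
        maxidx = i;
      }
    }
  }

  printf("max %.1f at index %d\n", maxval, maxidx);
  return 0;
}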
A reduction produces a single value from an operation such as addition, multiplication, min, max, logical and, or logical or.
It allows each thread to reduce into a private copy of the variable, and then reduces those private copies to obtain the final result.
The syntax is similar to shared/private, but you also need to give the operator:
int mysum = 0;
#pragma omp parallel for reduction(+:mysum)
for (int i = 0; i < 100; i++)
  mysum += i;
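Putting it together, the earlier summation task can be written with a reduction instead of atomic. A minimal sketch along the lines of the earlier programs:

#include <cstdio>

int main() {
  int sum = 0;
  const int N = 100;

  // Each thread accumulates into its own private copy of sum;
  // OpenMP combines the copies with + at the end of the loop.
  #pragma omp parallel for reduction(+:sum)
  for (int n = 1; n <= N; n++) {
    sum += n;
  }

  printf("Result is %d. It should be %d\n", sum, N*(N+1)/2);
  return 0;
}

Unlike the atomic version, the threads do not contend for sum inside the loop, so this usually scales better.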