speedGage

The speedGage application is implemented in C with MPI. It tests worker speed by giving each worker one large task. The manager (first) thread is excluded from this test. Run the software as follows.

mpiexec -np N ./speed Lam

where '\( N \)' is the number of MPI processes to launch (called threads in this document; the count includes the manager, so there are \( N - 1 \) workers) and 'Lam' \( (= \lambda) \) is a numerical parameter that sets the amount of work to be done.

For worker \( p_i \), the large task is to count the primes below a value \( X_i \). This value is random (to simulate user requests of random size) and distributed as \( Poisson(\lambda) \). This can be stated formally as follows.

\( \forall i > 0, X_i \sim_{iid} Poisson(\lambda) \)

Here are some examples of running the code on 3 nodes × 2 CPUs × 4 cores (= 24 cores) with \( \lambda = 1234567 \). Notice how the average calculation time increases once the number of tasks exceeds the number of cores.

# Thread count of each run, repeated once per worker (threads - 1 workers per run)
threads = c(rep(5, 4), rep(10, 9), rep(15, 14), rep(20, 19), rep(25, 24),
    rep(30, 29), rep(35, 34), rep(40, 39))
# Calculation time (s) reported by each worker, one vector per run
t5 = rep(2, 4)
t10 = c(rep(2, 7), rep(1, 2))
t15 = c(rep(2, 13), rep(1, 1))
t20 = c(rep(2, 11), rep(1, 8))
t25 = c(rep(2, 15), rep(1, 9))
t30 = c(rep(3, 2), rep(2, 16), rep(1, 11))
t35 = c(rep(4, 1), rep(3, 2), rep(2, 20), rep(1, 11))
t40 = c(rep(4, 2), rep(3, 5), rep(2, 21), rep(1, 11))
# Alternative (not used): plot one mean per run instead of every worker
# times = c( mean(t5) , mean(t10) , mean(t15) , mean(t20) , mean(t25) ,
# mean(t30) , mean(t35) , mean(t40))
times = c(t5, t10, t15, t20, t25, t30, t35, t40)
# Jitter both axes so that identical whole-second timings do not overlap
plot(jitter(threads), jitter(times), xlab = "Number of threads", ylab = "Calculation time (s)",
    main = "Jittered calculation times on 24 distributed cores")

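The commented-out `mean(...)` lines in the listing above suggest comparing one average per run. As a rough sketch of that calculation in C, the block below stores two of the runs as (seconds, count) groups copied from the R vectors `t25` and `t40` and computes their count-weighted means. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One group of identical timings: `count` workers each reported `seconds`. */
struct timing_group {
    int seconds;
    int count;
};

/* Timings copied from the R vectors t25 and t40 above. */
const struct timing_group t25[] = {{2, 15}, {1, 9}};
const struct timing_group t40[] = {{4, 2}, {3, 5}, {2, 21}, {1, 11}};

/* Count-weighted mean of the reported times for a single run. */
double mean_seconds(const struct timing_group *groups, size_t n_groups)
{
    long total = 0;
    long workers = 0;
    for (size_t i = 0; i < n_groups; i++) {
        total += (long)groups[i].seconds * groups[i].count;
        workers += groups[i].count;
    }
    assert(workers > 0);
    return (double)total / (double)workers;
}
```

Calling `mean_seconds(t25, 2)` gives 1.625 s for the 24 workers of the 25-thread run, and `mean_seconds(t40, 4)` gives about 1.95 s for the 39 workers of the 40-thread run. The timings are whole seconds, so these means are coarse.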

Here are some examples of running the code on a 2-core Intel i7 with \( \lambda = 12345678 \). The plotted values come from the run logs listed after the R code.

threads = c(2, 3, 3, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7)
times = c(22, 30, 31, 45, 41, 48, 46, 65, 67, 56, 61, 64, 63)
plot(as.factor(threads), times, xlab = "number of threads", ylab = "calculation times (s)", 
    main = "Increasing calculation times on two cores")


$ mpiexec -np 2 ./speed 12345678
Sending 12349182 to worker 1
810023 primes below 12349182!
Worker 1 found all primes below 12349182 in 22 seconds

$ mpiexec -np 3 ./speed 12345678
Sending 12349278 to worker 1
Sending 12340877 to worker 2
810029 primes below 12349278!
809516 primes below 12340877!
Worker 1 found all primes below 12349278 in 30 seconds
Worker 2 found all primes below 12340877 in 31 seconds

$ mpiexec -np 5 ./speed 12345678
Sending 12348677 to worker 1
Sending 12338196 to worker 2
Sending 12347511 to worker 3
Sending 12348262 to worker 4
809364 primes below 12338196!
809992 primes below 12348677!
809965 primes below 12348262!
809921 primes below 12347511!
Worker 1 found all primes below 12348677 in 45 seconds
Worker 2 found all primes below 12338196 in 41 seconds
Worker 3 found all primes below 12347511 in 48 seconds
Worker 4 found all primes below 12348262 in 46 seconds

$ mpiexec -np 7 ./speed 12345678
Sending 12341377 to worker 1
Sending 12344082 to worker 2
Sending 12352825 to worker 3
Sending 12344090 to worker 4
Sending 12347505 to worker 5
Sending 12347175 to worker 6
810245 primes below 12352825!
809547 primes below 12341377!
809702 primes below 12344090!
809700 primes below 12344082!
809921 primes below 12347505!
809901 primes below 12347175!
Worker 1 found all primes below 12341377 in 65 seconds
Worker 2 found all primes below 12344082 in 67 seconds
Worker 3 found all primes below 12352825 in 56 seconds
Worker 4 found all primes below 12344090 in 61 seconds
Worker 5 found all primes below 12347505 in 64 seconds
Worker 6 found all primes below 12347175 in 63 seconds
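One way to read these logs is to compare each run's mean worker time with a naive sharing estimate: when there are more workers than cores, every task is assumed to take `workers / cores` times as long as a task running alone. The sketch below is only that assumption written out in C. It ignores the manager process, hyper-threading and clock scaling, and the function names are hypothetical. The timing arrays are copied from the logs above.

```c
#include <assert.h>
#include <stddef.h>

/* Per-worker times (s) copied from the four runs logged above. */
const int run_np2[] = {22};
const int run_np3[] = {30, 31};
const int run_np5[] = {45, 41, 48, 46};
const int run_np7[] = {65, 67, 56, 61, 64, 63};

/* Plain mean of the per-worker times reported in one run. */
double mean_time(const int *seconds, size_t n)
{
    long total = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        total += seconds[i];
    }
    return (double)total / (double)n;
}

/* Naive estimate (an assumption, not a measured law): with more workers
 * than cores, the cores are shared evenly, so each task takes
 * workers / cores times as long as it would when running alone. */
double shared_core_estimate(double solo_seconds, int workers, int cores)
{
    assert(workers > 0 && cores > 0);
    if (workers <= cores) {
        return solo_seconds;
    }
    return solo_seconds * (double)workers / (double)cores;
}
```

With the single-worker time of 22 s as the baseline and 2 cores, the estimate is 22 s, 44 s and 66 s for 2, 4 and 6 workers. The observed means are 30.5 s, 45 s and about 62.7 s, so the estimate is close for 4 and 6 workers and too optimistic for 2.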