Parallelization
pocoMC can use a user-defined pool to execute a variety of expensive operations in parallel rather than in serial.
Both internal and external parallelization options are available.
Internal Parallelization
The simplest way to parallelize pocoMC, especially when running on a single machine (e.g., a laptop or a single CPU node on an HPC cluster), is to use the internal parallelization offered by pocoMC. This option relies on the multiprocess package to evaluate the likelihood function for all active particles in parallel.
To achieve this, the user simply has to provide the desired number of CPU processes. This should not exceed the number of available physical CPU cores (e.g., 12 for a modern MacBook Pro). The number of processes is provided through the pool argument during the initialization of the sampler class:
import pocomc as pc
sampler = pc.Sampler(prior, log_likelihood, pool=10) # For 10 parallel processes
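The advice about not exceeding the available cores can be sketched as a small helper. This is a minimal sketch, not part of pocoMC: `choose_n_processes` is a hypothetical name, and both `os.sched_getaffinity` and `os.cpu_count` report logical CPUs, which on machines with simultaneous multithreading may be about twice the number of physical cores.

```python
import os


def choose_n_processes(requested):
    """Cap `requested` at the number of logical CPUs usable by this process.

    Hypothetical helper, not part of pocoMC.
    """
    try:
        # Linux only: respects CPU affinity masks set by a job scheduler.
        available = len(os.sched_getaffinity(0))
    except AttributeError:
        # Other platforms: total logical CPUs (may be None if undetermined).
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


n_processes = choose_n_processes(10)
print(n_processes)
# The result could then be passed as `pool=n_processes` to `pc.Sampler`.
```

On a machine with simultaneous multithreading, halving the result is a reasonable first guess for the physical core count, although that ratio is itself an assumption about the hardware.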
External Parallelization
Alternatively, the user can provide an external pool to use instead of the internal multiprocess one.
SMP
If you have an external shared-memory multiprocessing (SMP) pool that you want to use instead of the internal multiprocess one, then you can provide it through the pool argument. For instance, to use a pool from the standard-library multiprocessing module, one would do:
from multiprocessing import Pool
import pocomc as pc
with Pool(10) as pool:
    sampler = pc.Sampler(prior, log_likelihood, pool=pool) # For 10 parallel processes
    sampler.run() # Run inside the with block, while the pool is still open
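One practical consequence of using the standard-library pool: `multiprocessing` sends work to its workers with `pickle`, so the likelihood should be a function defined at module level rather than a lambda or a nested function. The sketch below illustrates this with a toy Gaussian log-likelihood (the function and the sample points are illustrative assumptions) and calls `Pool.map` directly to show how a pool spreads evaluations over workers. How pocoMC calls the pool internally is not specified here.

```python
import math
from multiprocessing import Pool


def log_likelihood(x):
    """Toy standard-normal log-likelihood for one parameter vector `x`."""
    return -0.5 * sum(xi * xi for xi in x) - 0.5 * len(x) * math.log(2.0 * math.pi)


if __name__ == "__main__":
    # Illustrative parameter vectors standing in for the active particles.
    particles = [[0.0, 0.0], [1.0, -1.0], [0.5, 2.0]]

    with Pool(2) as pool:
        # Each evaluation is pickled, sent to a worker, and the result returned.
        parallel = pool.map(log_likelihood, particles)

    # The parallel results should match a plain serial loop.
    serial = [log_likelihood(p) for p in particles]
    print(parallel == serial)
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the script, since without it each worker would try to create its own pool.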
MPI
When running on a High-Performance Computing (HPC) cluster with multiple nodes of many CPUs each, it may be beneficial to use Message Passing Interface (MPI) parallelization. A simple way to achieve this is using the provided MPIPool as follows. Please note that you will need to have mpi4py installed to use this option.
import pocomc as pc
if __name__ == '__main__':
    with pc.parallel.MPIPool() as pool:
        sampler = pc.Sampler(prior, log_likelihood, pool=pool)
        sampler.run()
The above script should be executed via mpiexec -n 256 python script.py where 256 is the number of processes.
Notes
Because numpy performs some internal parallelization using OpenMP, it is a good idea to limit it to a single thread when running pocoMC in parallel, to avoid unwanted interference between the two. To do this, set the number of OpenMP threads to one at the beginning of the code, before numpy is imported:
import os
os.environ["OMP_NUM_THREADS"] = "1"
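Depending on how numpy was built, its linear algebra backend may read a different variable than `OMP_NUM_THREADS`. The sketch below sets the commonly used ones together. Which of them has any effect on a given installation is an assumption that depends on the backend (OpenBLAS, MKL, or another library).

```python
import os

# Thread-count variables read by common numerical backends.
# Which ones matter depends on the backend numpy is linked against.
THREAD_VARS = (
    "OMP_NUM_THREADS",       # OpenMP runtimes
    "OPENBLAS_NUM_THREADS",  # OpenBLAS
    "MKL_NUM_THREADS",       # Intel MKL
)

for name in THREAD_VARS:
    os.environ[name] = "1"

# Import numpy (and pocomc) only after this point so the settings take effect.
print({name: os.environ[name] for name in THREAD_VARS})
```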
Note also that parallelization incurs some communication overhead. For most applications, this overhead is small and adds little to the total run time. However, if the likelihood function is very cheap to evaluate (usually less than 10 ms per call), the overhead may be comparable to that cost. Parallelization is therefore worthwhile only when a likelihood evaluation costs more than the overhead.
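A rough way to check which regime applies is to time the likelihood in serial before enabling a pool. The sketch below is illustrative only: `toy_log_likelihood`, `mean_call_time`, and the sample point are hypothetical, and the 10 ms figure is the rule of thumb quoted above rather than a measured overhead.

```python
import time


def toy_log_likelihood(x):
    """Stand-in for a real likelihood; replace with your own function."""
    return -0.5 * sum(xi * xi for xi in x)


def mean_call_time(func, arg, n_calls=1000):
    """Return the average wall-clock time of `func(arg)` in seconds."""
    start = time.perf_counter()
    for _ in range(n_calls):
        func(arg)
    return (time.perf_counter() - start) / n_calls


THRESHOLD_S = 0.010  # 10 ms rule of thumb from the text above

cost = mean_call_time(toy_log_likelihood, [0.1, 0.2, 0.3])
print(f"mean cost per call: {cost * 1e3:.4f} ms")
print("parallelization likely worthwhile:", cost > THRESHOLD_S)
```

For an expensive likelihood, a smaller `n_calls` keeps the measurement short while still averaging out timer noise.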