Checkpointing
A useful option, especially for long runs, is to store the state of pocoMC in a file and later use
that file to continue the same run. This helps avoid disastrous situations in which a run is interrupted
or terminated prematurely (e.g. due to time limits on computing clusters or crashes).
Fortunately, pocoMC offers options both to save and to load the state of the sampler.
Save
To save the state of the sampler during a run, specify how often the state should be written to a file. This is
done using the save_every argument of the run method. The default is save_every=None, which means that no state
is saved during the run. If instead we want to store the state of pocoMC every 3 iterations, for example, we would do
something like:
sampler.run(save_every = 3)
By default, the state files are saved in a folder named states in the current directory. One can change
this using the output_dir argument when initialising the sampler (e.g. output_dir = "new_run"). By default, the state
files follow the naming convention pmc_{i}.state, where i is the iteration index. For instance, if save_every=3 was
specified, then the output_dir directory will contain the files pmc_3.state, pmc_6.state, etc. One can also change
the label from pmc to anything else by using the output_label argument when initialising the sampler (e.g.
output_label="grav_waves").
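The naming convention above makes it straightforward to find the most recent checkpoint in a directory. The following is a minimal sketch using only the Python standard library; latest_state is a hypothetical helper written for illustration and is not part of pocoMC. It assumes the state files follow the {output_label}_{i}.state pattern described above.

```python
import re
from pathlib import Path


def latest_state(output_dir="states", output_label="pmc"):
    """Return (iteration, path) for the highest-numbered state file, or None.

    Hypothetical helper, not part of pocoMC. It assumes files are named
    {output_label}_{i}.state, where i is the iteration index.
    """
    pattern = re.compile(rf"^{re.escape(output_label)}_(\d+)\.state$")
    found = []
    for path in Path(output_dir).glob(f"{output_label}_*.state"):
        match = pattern.match(path.name)
        if match:  # ignore files whose suffix is not a numeric iteration index
            found.append((int(match.group(1)), path))
    # Compare by iteration number so that pmc_12 ranks above pmc_6.
    return max(found, default=None)
```

Sorting on the parsed integer rather than on the file name matters here, because a plain alphabetical sort would place pmc_12.state before pmc_3.state.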
Load
To load a previous state of the sampler and resume the run from that point, provide the path to the specific state
file to the run method using the resume_state_path argument. For instance, if we want to continue the run from the
pmc_3.state file in the states directory, we would do:
sampler.run(resume_state_path = "states/pmc_3.state")
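On a computing cluster, the same script is often submitted several times, so it can be convenient to resume when a state file exists and start fresh otherwise. The sketch below builds the keyword arguments for the run method under that pattern. run_kwargs is a hypothetical helper, not part of pocoMC, and it assumes that save_every and resume_state_path can be passed to run together so that a resumed run keeps writing checkpoints.

```python
from pathlib import Path


def run_kwargs(state_path=None, save_every=3):
    """Build keyword arguments for sampler.run().

    Hypothetical helper, not part of pocoMC. Checkpointing stays enabled,
    and resume_state_path is added only if the state file actually exists.
    """
    kwargs = {"save_every": save_every}
    if state_path is not None and Path(state_path).is_file():
        kwargs["resume_state_path"] = str(state_path)
    return kwargs


# Usage with an already constructed sampler (not created in this sketch):
# sampler.run(**run_kwargs("states/pmc_3.state"))
```

With this approach the first submission starts a new run, and later submissions pick up from the given state file without any change to the script.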
Load and Add More Samples
It is possible to add more samples to a finished run. This is useful when one wants to experiment with small runs until they get their analysis right, and then increase the number of required posterior samples to get publication-quality results. When save_every is not None, pocoMC will save a final file when sampling is done. By default, this is called pmc_final.state. We can load this state and change the termination criteria in order to add more samples, as follows:
sampler.run(
    n_total=16384,  # Total number of samples we want to draw, including the ones we already have.
    n_evidence=16384,  # Number of samples we want to draw for the evidence estimation.
    resume_state_path="states/pmc_final.state",
)
In this case, we chose to terminate sampling when the total effective sample size (ESS) exceeds n_total=16384, which is higher than the default value of n_total=4096. We also provided a higher number of samples for the evidence estimation, which means that the new evidence estimate will be more accurate than the original. However, we could have chosen to set n_evidence=0 and only add more posterior samples.
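The posterior-only variant can be sketched in the same way. Because the final state file is only written when save_every is not None, it is worth checking that the file exists before trying to extend a run. extension_kwargs below is a hypothetical helper, not part of pocoMC. Its default path assumes the default output_dir and output_label.

```python
from pathlib import Path


def extension_kwargs(n_total, n_evidence=0, final_state="states/pmc_final.state"):
    """Build keyword arguments for extending a finished run.

    Hypothetical helper, not part of pocoMC. n_evidence=0 requests more
    posterior samples only, leaving the evidence estimate as it was.
    """
    final_state = Path(final_state)
    if not final_state.is_file():
        # The final state is only saved if the original run set save_every.
        raise FileNotFoundError(
            f"{final_state} not found; was the original run started with save_every set?"
        )
    return {
        "n_total": n_total,
        "n_evidence": n_evidence,
        "resume_state_path": str(final_state),
    }


# Usage with an already constructed sampler (not created in this sketch):
# sampler.run(**extension_kwargs(n_total=16384))
```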