Launch parallel jobs with scatter
You can find this tutorial in the demos folder of your Jupyter notebook environment:
- mpi_scatter.ipynb
Parallel computation is a valuable time-saving technique for research that runs similar calculations over a large number of parameter combinations.
The Camber create_scatter_job function supports research on parameter grids. A scatter job receives a table of parameters as input and then runs one parallel job for each combination of the parameter values. In some scientific circles, this technique is known as a parameter sweep.
All Camber engines support scatter jobs. As they run on Camber infrastructure, they can also compute data-intensive workloads on a massively parallel scale. The following sections provide some examples.
Run “Hello, Solar System” with MPI
The example in Run your first job shows how to use the MPI engine to run a “Hello, World” with multiple processors.
However, there are many more places to greet than “World”, and many more greetings than “Hello.” With a scatter job, you can run a parallel job that gives each planet in your parameters a set of greetings.
The first step is to import the MPI package:
import camber.mpi
Now, define your parameters:
params = {
    "planet": [
        "Mercury",
        "Venus",
        "Earth",
        "Mars",
    ],
    "greeting": [
        "Bonjour",
        "¡Hola",
    ],
}
Use the command_template argument to create a scatter job with every combination of greeting and planet:
mpi_jobs = camber.mpi.create_scatter_job(
    engine_size="XMICRO",
    command_template="echo '{greeting}, {planet}!'",
    template_params_grid=params,
)
The jobs start running shortly. Check the output of each job like this:
for job in mpi_jobs:
    job.read_logs()
    print("\r")
Quick reminder: you can also download the log files for every job by replacing read_logs() with download_log().
Check understanding: how many jobs?
Note that the params dictionary has two keys: one with four values and the other with two.
How many total jobs are created?
The answer is the Cartesian product of the lengths of the parameter lists used in the command template: in this case, 4 × 2 = 8.
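The count is easy to verify in plain Python by multiplying the lengths of the parameter lists:

```python
# Total scatter jobs = product of the lengths of the parameter lists.
params = {
    "planet": ["Mercury", "Venus", "Earth", "Mars"],
    "greeting": ["Bonjour", "¡Hola"],
}

total_jobs = 1
for values in params.values():
    total_jobs *= len(values)

print(total_jobs)  # 4 * 2 = 8
```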