Scheduling jobs
In practice, the objective function is often a Python script that takes command-line arguments as parameters, or a function with many dependencies.
Importing this function into the hyperparameter optimisation script or wrapping the target script involves some boilerplate code.
To help with that, Hypertunity allows objective functions to be specified as Job instances, which are then run in succession or in parallel by a Scheduler.
The Scheduler is a wrapper around joblib and takes care of both running jobs and collecting results.
Job instances are scheduled using the dispatch method of a Scheduler:
jobs = [Job(...) for _ in range(10)]
scheduler.dispatch(jobs)
evaluations = [r.data for r in scheduler.collect(n_results=len(jobs), timeout=10.0)]
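The dispatch-then-collect pattern can be illustrated without Hypertunity installed. The sketch below uses the standard library's concurrent.futures as a stand-in; it is not joblib and not Hypertunity's Scheduler. The names objective and run_batch are hypothetical, and returning results in submission order is a choice made for this sketch only.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def objective(x, y):
    """Toy objective standing in for a real training run (hypothetical)."""
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


def run_batch(param_sets, n_parallel=2, timeout=10.0):
    """Submit one task per parameter set and gather results as they finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=n_parallel) as pool:
        # Map each future back to the index of the parameter set it evaluates.
        futures = {pool.submit(objective, *p): i for i, p in enumerate(param_sets)}
        for fut in as_completed(futures, timeout=timeout):
            results[futures[fut]] = fut.result()
    # Reorder to match the order in which the parameter sets were submitted.
    return [results[i] for i in range(len(param_sets))]


evaluations = run_batch([(1.0, -2.0), (0.0, 0.0), (2.0, -1.0)])
```

Keeping the evaluations aligned with the submitted parameter sets makes it straightforward to pair each sample with its score afterwards.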
There are multiple ways to define a job depending on the target to optimise.
Local Python callable
If the function is defined or imported within the hyperparameter optimisation script, the task argument is the callable itself.
The args argument is then a tuple of positional arguments or a dict of named arguments, which are supplied to the task function when it is called.
For example:
jobs = [ht.Job(task=foo, args=(*s.as_namedtuple(),)) for s in samples]
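To make the difference between the two forms of args concrete, here is a minimal sketch of how a tuple and a dict might be unpacked when calling the task. The helper call_task is hypothetical and only illustrates the idea; it is not taken from Hypertunity's source.

```python
def call_task(task, args):
    """Call `task` with positional args from a tuple or named args from a dict."""
    if isinstance(args, dict):
        return task(**args)  # named arguments: order does not matter
    return task(*args)       # positional arguments: order must match the signature


def foo(x, y):
    return x - y


positional = call_task(foo, (3, 1))
named = call_task(foo, {"y": 1, "x": 3})
```

A dict is the safer choice when the order of the hyperparameters in a sample may not match the order of the function's parameters.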
Python callable in a script
If the function to optimise resides in a script, Hypertunity allows for specifying a target by the full path to the script.
To select the objective function from the script, append : and the function name to the path:
jobs = [Job(task="path/to/script.py:foo", args=(*s.as_namedtuple(),)) for s in samples]
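One way a "path:function" string could be resolved is with importlib from the standard library. The following is a sketch under that assumption, not a description of how Hypertunity does it internally; load_callable and the throwaway script are hypothetical.

```python
import importlib.util
import os
import tempfile


def load_callable(target):
    """Load the function `name` from the file `path`, given a 'path:name' string."""
    # Split on the last colon so that paths containing a colon still work.
    path, _, name = target.rpartition(":")
    spec = importlib.util.spec_from_file_location("target_module", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, name)


# Demonstrate with a throwaway script written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "script.py")
    with open(script, "w") as f:
        f.write("def foo(x, y):\n    return x * y\n")
    foo = load_callable(f"{script}:foo")
    result = foo(3, 4)
```

Note that loading the module this way executes the script's top-level code, so a target script should keep side effects behind an `if __name__ == "__main__":` guard.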
A script
If the objective function is a full command-line application, or a script that accepts the hyperparameters to tune as command-line arguments, create a job as follows:
jobs = [Job(task="path/to/script.py",
            args=(*s.as_namedtuple(),),
            meta={"binary": "python"}) for s in samples]
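A job like this presumably ends up as a command line made of the binary, the script path and the arguments. The sketch below shows that idea with subprocess. The helper build_command, the assumption that arguments are passed positionally, and reading the score from standard output are all assumptions of this example rather than documented Hypertunity behaviour.

```python
import os
import subprocess
import sys
import tempfile


def build_command(binary, script, args):
    """Assemble a command line; every argument is passed as a positional string."""
    return [binary, script, *(str(a) for a in args)]


# A hypothetical target script that reads two hyperparameters and prints a score.
SCRIPT = """
import sys
x, y = float(sys.argv[1]), float(sys.argv[2])
print(x + y)
"""

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "script.py")
    with open(script, "w") as f:
        f.write(SCRIPT)
    # sys.executable plays the role of the "python" binary from the meta dict.
    cmd = build_command(sys.executable, script, (0.5, 2))
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    score = float(completed.stdout.strip())
```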
Using Slurm
To schedule jobs using Slurm, a special job type is available. It lets you configure resources and other Slurm parameters, but it also requires that the target script can write a results file to disk.
jobs = [SlurmJob(task="path/to/script.py",
                 args=(*sample.as_namedtuple(),),
                 output_file="path/to/results.pkl",
                 meta={"binary": "python", "resources": {"cpu": 1}})
        for sample in samples]
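Since the target script has to write its result to disk, it may help to see what that could look like. This is a minimal sketch that assumes the result is a single score stored with pickle, as the .pkl extension suggests. The function names, the way the output path reaches the script, and the exact file contents Hypertunity expects are assumptions and should be checked against the library.

```python
import os
import pickle
import tempfile


def objective(x, y):
    """Toy objective standing in for the real computation (hypothetical)."""
    return (x - 1.0) ** 2 + y


def run_and_save(x, y, output_file):
    """Evaluate the objective and write the score to `output_file` as a pickle."""
    score = objective(x, y)
    with open(output_file, "wb") as f:
        pickle.dump(score, f)
    return score


# Write a result and read it back, as a scheduler might do after the job ends.
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "results.pkl")
    run_and_save(3.0, 0.5, out)
    with open(out, "rb") as f:
        loaded = pickle.load(f)
```

Writing to a file rather than returning a value matters here because a Slurm job typically runs in a separate process, possibly on another machine, so the file is the channel through which the result travels back.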